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Abstract 

We study self-replicating molecules under externally varying conditions. Changing 
conditions such as temperature variations and/or alterations in the environment's 
resource composition lead to both non-constant replication and decay rates of the 
molecules. In general, therefore, molecular evolution takes place in a dynamic rather 
than a static fitness landscape. We incorporate dynamic replication and decay rates 
into the standard quasispecies theory of molecular evolution, and show that for 
periodic time-dependencies, a system of evolving molecules enters a limit cycle for 
t —* oo. For fast periodic changes, we show that molecules adapt to the time- 
averaged fitness landscape, whereas for slow changes they track the variations in the 
landscape arbitrarily closely. We derive a general approximation method that allows 
us to calculate the attractor of time-periodic landscapes, and demonstrate using 
several examples that the results of the approximation and the limiting cases of very 
slow and very fast changes are in perfect agreement. We also discuss landscapes with 
arbitrary time dependencies, and show that very fast changes again lead to a system 
that adapts to the time-averaged landscape. Finally, we analyze the dynamics of a 
finite population of molecules in a dynamic landscape, and discuss its relation to 
the infinite population limit. 
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1 Introduction 



Eigen's quasispecies model [1] has been the basis of a vivid branch of molecular 
evolution theory ever since it has been put forward almost 30 years ago [2- 
19]. Its two main statements are the formation of a quasispecies consisting of 
several molecular species with well defined concentrations, and the existence 
of an error threshold above which all information is lost because of accu- 
mulating erroneous mutations. Both effects have been found in a number of 
experimental as well as theoretical studies (For experimental evidence, see, 
e.g., [20] for the formation of a quasispecies in the RNA of the Q/3 phage, 
[21] for the observation of an error threshold in a system of self-replicating 
computer programs, and [22] for a more recent example of quasispecies for- 
mation in the Hepatitis C virus. Generally, see the reviews [11,13,19] and the 
references therein). Recently, a new aspect of the quasispecies model has been 
brought into consideration that was mostly missing in previous works, namely 
the aspect of a dynamic fitness landscape [17,18,23-25] The notion "Dynamic 
fitness landscape" encompasses all situations in which the replication and/or 
decay rates of the molecules change over time. In the present work, we are 
interested in situations where these changes occur as an external influence for 
the evolving system, without feedback from the system to the dynamics of 
the landscape. Dynamic fitness landscapes of this kind are important, because 
almost any biological system is subject to external changes in the form of, e.g., 
daytime/nighttime, seasons, long-term climatic changes, geographic changes 
due to tectonic movements, to name just a few. 

The main problem one encounters when dealing with dynamic landscapes is 
the difficulty to find a good generalization of the quasispecies concept. In the 
original work of Eigen, the quasispecies is the equilibrium distribution of the 
different molecular species. It is reached if the system is left undisturbed for 
a sufficiently long time. Since in a dynamic landscape the system is being 
disturbed by the landscape itself, the concept of a quasispecies is meaningless 
in the general case. However, there are special cases in which a meaningful 
quasispecies can be defined. If, for example, the landscape changes on a much 
slower time scale than what the system needs to reach the equilibrium, then 
the system is virtually in equilibrium all the time, and the concentrations at 
time t are determined from the landscape present at that time. Generally, 
it is certain symmetries in the dynamics of the landscape that allow for the 
definition of a quasispecies. One example we treat in this paper in detail is the 
case of time-periodic landscapes, which offer a natural quasispecies definition. 

An early investigation of dynamic landscapes has been carried out by Jones [4,5], 
who has only considered cases in which all replication rates change by a com- 
mon factor. Therefore, this approach excludes (among other cases) in particu- 
lar all situations in which the order of the molecules' replication rates changes 
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over time, i.e., in which for example one of the faster replicating molecules 
becomes one of the slower replicating molecules and vice versa. Recent work 
on dynamic fitness landscapes allow for such changes. Wilke et al. [26,17] 
have developed a framework that allows to define and to calculate numeri- 
cally a quasispecies in time-periodic landscapes. Independently, Nilsson and 
Snoad [18] have studied the particular example of a stochastically jumping 
peak in an otherwise flat landscape. This work has been generalized by Ron- 
newinkel et al. [23], who could also define a meaningful quasispecies for a 
deterministic version of the jumping peak landscape and related landscapes. 
Very recent work of Nilsson and Snoad deals with self-adaptation of the muta- 
tion rate in a jumping peak landscape [25], and with the low-pass filter effect 
of dynamic fitness landscapes [24], a notion put forward by Hirst [27,28]. Fi- 
nally, theoretical studies of dynamic fitness landscapes can also be found in 
the related field of genetic algorithms. Schmitt et al. [29,30] derive results for 
finite populations in a relatively broad class of dynamic landscapes. However, 
only landscapes in which the fitnesses get scaled can be treated, so that the 
same restriction applies here that applied to Jones's work. The order of the 
fitnesses can never change. Also, genetic algorithms with time-periodic land- 
scapes have been studied by Rowe [31,32]. However, Rowe's approach has the 
disadvantage that it is tightly connected to the discrete time used in genetic 
algorithms, and that the dimension of the transition matrices grows in pro- 
portion to the period length T of the oscillation. This makes it hard to derive 
analytical results, and in addition to that, it renders landscapes with large T 
inaccessible to numerical calculations. 

In this report, we do not cover the large field of molecular evolution in the 
particular landscapes induced by RNA sequence-to-structure mappings [33- 
35], and, in connection to that, we do not discuss neutral networks [36-39]. 
We do so mainly because these topics have so far not been studied in the 
light of temporal variations in the fitness landscapes, and hence, a discussion 
of them does not fit very well into the general tenor of this work. In general, 
it can be argued that neutral networks are of less importance in a dynamic 
environment, because in that situation a population is constantly on the move 
to the next local optimum [40] . 

The remainder of this report is structured as follows. We begin our discus- 
sion in Section 2.2 with a brief summary of the general aspects of dynamic 
fitness landscapes in the quasispecies model. In Section 3, we will develop 
the main subject of this work: a general theory of time-periodic fitness land- 
scapes. The theoretical part thereof is presented in Section 3.1, in which we 
demonstrate how a time-dependent quasispecies can be defined by means of 
the monodromy matrix, and how this monodromy matrix can be expanded 
in terms of the oscillation period T. In Section 3.2, we present an alternative 
approximation formula for the monodromy matrix that is more suitable for 
numerical calculations, and in Section 3.3, we compare, for several example 
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landscapes, the results obtained from that formula with the general theory 
developed in Section 3.1. The restriction of a time-periodic fitness landscape 
is weakened in Section 4, where we discuss the implications of our findings for 
other, non-periodic fitness landscapes. Since our work is based on Eigen's de- 
terministic approach with differential equations, all results presented up to the 
end of Section 4 are only valid for infinite population sizes. In order to address 
this shortcoming, in Section 5 we give a brief introduction into the problems 
involved when dealing with finite populations. In Section 5.1, some simulation 
results are shown, demonstrating the relationship between the results from the 
infinite population limit and the actual finite population dynamics. Finally, 
an approximative analytical description of a finite population evolving on a 
simple periodic landscape is developed in Section 5.2. We close this paper with 
some conclusions in Section 6. 



2 The quasispecies model 

2.1 Static landscapes 

With his model of self-replicating molecules, Eigen showed for the first time 
that Darwin's idea of mutation and selection can work in a simple, seemingly 
"lifeless" system of chemical reactants. His observation that evolution is gov- 
erned by the laws of physics spawned a fruitful field of work, and hundreds of 
papers based on his initial ideas have been written since then. Most of that 
work has been concerned with static fitness landscapes. There exist compre- 
hensive reviews of that work (see [11,13] for a very detailed coverage of the 
literature till 1989, and [19] for a more recent review). Here, we are going to 
briefly introduce the main concepts developed for static landscapes. In doing 
so, we restrict ourselves to those concepts that we will refer to later on in our 
discussion of dynamic fitness landscapes. For a more in-depth discussion of 
static landscapes, the reader is referred to the above mentioned reviews. 

The quasispecies model was originally aimed at describing self-replicating 
DNA or RNA molecules. Therefore, the molecules were conceived of as se- 
quences consisting of a limited number of elementary building blocks. With 
that picture in mind, we may represent the molecules as sequences of let- 
ters. For RNA molecules, e.g., the conventional alphabet consists of the 4 
letters G, A, C, U, representing the 4 bases guanine, adenine, cytosine, and 
uracil, respectively. Today, most researchers are interested in the information- 
theoretic aspects of the model. Consequently, the most common alphabet in 
the molecular evolution literature has become the binary alphabet, consisting 
of the letters and 1. Throughout this work, we will also adopt this choice. 
With regard to the RNA example, the binary alphabet can be considered 
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as distinguishing only between purines (which are guanine and adenine) and 
pyrimidines (which are cytosine and uracil). 

The molecules the model describes have the ability to self- replicate. Self- 
replication is a complicated process, which consumes energy and substrates 
from the environment. These external resources are supposed to be present, 
and are not modeled explicitly. The degree to which a macromolecule finds the 
resources necessary to self-replicate, and is able to exploit them, is expressed 
in the replication coefficients A^. A molecule % that finds good conditions for 
self- replication has a high A iy another molecule j which is a bad self- replicator 
has a much smaller A, . 

The molecules replicate by copying themselves. The copy procedure is gener- 
ally not error- free. Among the different types of errors one can imagine for 
the copy of a sequence of letters (substitutions, insertions, deletions) , we con- 
sider only substitutions. Substitutions can in principle occur with a different 
probability at every single position in the sequence. However, if we assume 
the copy procedure to be performed step by step by some sort of molecular 
machinery, the probability of copying a symbol incorrectly can be expected 
to be the same for all positions in the sequence. Hence, a letter will change 
from to 1 or vice versa during the copy procedure with a fixed probability, 
denoted by R. This probability is called the error rate, or, alternatively, the 
mutation rate. 

Molecules are also subject to decay with a particular rate Di for species %. A 
molecule that decays is assumed to be completely absorbed by the environ- 
ment, i.e., it does not break into parts that are themselves being described by 
the model. 

The constant production of new molecules due to the ongoing self-replication 
will drastically increase the concentration of molecules, and will decrease the 
amount of resources available for further self-replication. We are interested in 
the description of a steady state, and therefore we have to introduce a constant 
dilution which lets new resources enter the system and washes away the excess 
production of those molecules. The total concentration of molecules can thus 
be kept constant by proper adjustment of the dilution flux. 

Finally, we assume that the self-replicating molecules are placed in a well- 
stirred reactor. As a consequence of this assumption, we can neglect any 
spatial correlations in the model, and concentrate solely on the molecules' 
abundances. 

In summary, the model is based on the following postulates: 

(1) The molecules are represented by binary sequences of length /. They form 
and decompose steadily. The number of copies of sequence i present at 
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time t is denoted by rii(t). 

(2) Sequences enter the system solely as the result of a correct or erroneous 
copy of another sequence already present. 

(3) The substrates necessary for the ongoing replication are always present 
in sufficient quantity. Excess molecules are washed away by a flux $(£). 

(4) The decay of sequences is a Poisson process. 

These four assumptions form the basis of Eigen's theory of molecular evolu- 
tion. A quantitative analysis of these assumptions can be done in terms of 
differential equations in the molecules' occupation numbers rij(i). In the fol- 
lowing paragraphs, we recall the quantitative analysis of the model that has 
been developed by Eigen and coworkers. 

We begin by writing down an expression for the change in the number of 
copies of sequence %. The abundance rii(t) of sequence % increases because a 
proportion of the molecules of type % replicates faithfully, while some of the 
other molecules of other types produce offspring of type % as the outcome 
of fruitless attempts to self- replicate. Let the matrix Q = {Qi^j express the 
probability that molecule j copies into molecule i. The associated increase in 
the abundance rij(i) then amounts to J2j AjQijiij(t). 

The decrease of a molecule's abundance can have two reasons: its decay, 
expressed by —DiUiit), and its removal due to the flux term, expressed by 
— rii(t)Q>(t)/N(t). Here, N(t) is the total number of molecules at time t, i.e., 
N(t) = J2j fij(t). Putting all the different terms together, we end up with the 
net change hi(t) in the number of copies of sequence i, 



hi(t) 



-^-iQii &i 



The quantity Qu gives the probability that a molecule % replicates faithfully. 
Qu is sometimes referred to as the copy fidelity. 

For the further development of the theory, it is useful to introduce an average 
excess production E(t), defined by 



E(t) :=5>(f) 

i 

The conservation law 



A: - D, 



N(t) . (2) 



EG« = 1 » ( 3 ) 

i 

expressing the fact that every (possibly erroneous) copy of a molecule rep- 
resents another molecule in the chemistry, allows us to rewrite the average 
excess production in terms of the total production rate and the dilution flux, 
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VIZ. 



E(t) 



N(t) 



(4) 



It proves to be useful to define the matrix W = as 

Wij '.= A jQij D{5ij . 
Now we can rewrite Eq. (I) as 



hi(t) = 



W u - E(t) 



n t (t) + ^2W tJ n J (t)+n z (t)^& 



(5) 



(6) 



Let us introduce concentration variables Xi(t), defined by 

m(t) 



Xi(t) :-- 



N(t) 



(7) 



The quantity Xi{t) measures the relative concentration of molecule i in the 
population. The change in Xi(t) is given by 



N(t) 



N(ty 



(8) 



Hence, if we subtract the rightmost term of Eq. (6) on both sides of Eq. (6), 
and divide by N(t), we arrive at 



Xi(t) 



W u - E(t) 



x i (t) + '£W ij (t)x j (t). 



The total number of molecules grows with 

N(t) = E(t)N(t) - . 



(9) 



(10) 



Typically, one assumes that the flux <3>(t) is adjusted such that N(t) vanishes, 
as expressed by Assumption 3 on page 7. However, this assumption is not 
really necessary for the further development of the theory. Since from this 
point onwards, we will only be concerned with the relative concentrations 
Xi(t), we will disregard the flux altogether, and work with Eq. (9) exclusively. 

At this point, it is useful to introduce vector notation, by lumping the concen- 
trations xi(t),x 2 (t), . . . together into a single vector x(t) = (xi(t),x 2 (t), . . 
Equation (9) then becomes 



x(t) 



W-E(t)l 



x(t), 
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where 1 is the identity matrix. The matrix W can be decomposed into 



W=QA-D, (12) 

if we write the replication and the decay coefficients in matrix form as well. 
Both A and D are diagonal matrices of the form 

A = diag(Ai,A 2 ,...) , (13a) 
D = diag(L> 1 , J D 2 ,...) . (13b) 

The matrix Q is the mutation matrix introduced above. Note that the average 
excess production can also be transformed into vector notation. It takes on 
the form 

E(t) = e* • [A(*)je(*) - D(f)x(*)] , (14) 

where e t is a vector of Is, i.e., e t = (1, . . . , 1). The scalar product between e* 
and a concentration vector [say y(t)] gives exactly the sum over all components 
of that vector. 

Equation (11) is nonlinear, since the term E(t)x(t) is quadratic in x(t). Nev- 
ertheless, a straightforward solution of Eq (11) is possible, because a trans- 
formation exists which removes the nonlinearity. The strength of this trans- 
formation lies in the easy reconstruction of the concentration variables Xi(t) 
from the transformed system. Following Thompson and McBride [2] , or Jones 
et al. [3], we introduce 

y(t) :=exp^E(r)dT^x(t). (15) 

The new variables satisfy the linear equation 

y(t) = Wy(t) , (16) 

which can be shown by insertion of Eq. (15) into Eq. (16). Moreover, since y(t) 
differs from x(t) only by a scalar factor, we can restore the original variables 
via 

W e* •!/(*) V ; 

Note that if all decay constants are equal, i.e., D = diag(Z), ...,£)) with a 
single scalar value D, then the transformation 

y(t) = exp (j[W) + D] dr^j x(t) (18) 

leads to the even simpler equation 

y(t) = QAy(t) . (19) 
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The concentration vector x(t) can again be obtained from Eq. (17). 



The mutation matrix Q has so far remained undefined. The choice of Q spec- 
ifies the relationship between the different molecules in the chemistry. As was 
stated above, we conceive the molecules as bitstrings of length /, and we as- 
sume that copy errors are equally likely on all positions on the string. In that 
case, the mutation matrix can be written down straightforwardly. Suppose 
that two bitstrings differ in d positions. A mutation from one to the other 
occurs only if exactly these d positions are copied erroneously, while all others 
are copied faithfully. Such a copy error appears with probability (1 — R) l ~ d R d . 
Hence, we can write the mutation matrix Q as 



where d(i,j) is the Hamming distance between two sequences. The Hamming 
distance is the number of bits in which two sequences differ. 

The matrix Q is typically very large, since its number of rows and columns 
grows as 2 l with increasing sequence length I. This makes Eq. (16) almost 
intractable, numerically as well as analytically, for all but the shortest se- 
quences. An alternative matrix Q' is often used, in which certain sequences 
are grouped together, such that the number of concentration variables can be 
reduced to I + 1. The main idea of this grouping, developed by Swetina and 
Schuster [7], is to define a single sequence (which should have the highest repli- 
cation coefficient of all sequences) as master sequence, and to group all other 
sequences into error classes, according to their Hamming distance from the 
chosen master sequence. All the sequences with the same Hamming distance 
from the master form a single error class. This procedure has the advantage 
of greatly reducing the number of variables, but it also restricts considerably 
the number of fitness landscapes that can be studied. Since all sequences in an 
error class have to share the same replication coefficient, a fitness landscape, 
for example, in which two peaks have a small or moderately large Hamming 
distance cannot be defined. 

The error class matrix Q' has been given by Nowak and Schuster [14] in a 
relatively simple form: 



Note that in [14] , the indices % and j are interchanged inadvertently. The error 
class matrix can be derived from Eq. (20) by deletion of all but one column 
of every group of columns whose index j yields the same d(0,j), and by the 
subsequent summation of all rows whose index i yields the same d(i, 0). 




(20) 




i+j-2k 



(21) 
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The linearized evolution equation (16) can be solved analytically. The transi- 
tion from an initial state y(0) to the state at time t is given by [41] 

y(t) = exp(Wt)y(0) . (22) 

From that expression, we can read off that the (unnormalized) variables y(t) 
will grow exponentially over time. Mathematically, this growth can be accom- 
panied by either exponentially damped or exponentially amplified oscillations. 
For all cases of biological interest, however, there can be at most exponentially 
damped oscillations. First of all, we will almost always deal with a symmetric 
matrix Q. A symmetric mutation matrix implies that the probability with 
which a sequence % mutates into a sequence j is the same as the probability 
for the inverse process. Rumschitzki has noted that in this case, the spectrum 
of W is real [9] . Namely, we can transform the non-symmetric matrix W into 
a symmetric one by means of a similarity transformation, 

W = QA-D -> A 1/2 WA' 1/2 = A 1/2 QA 1/2 -D. (23) 

The spectrum of the transformed matrix is real because of the matrix's sym- 
metry, and hence the spectrum of the untransformed matrix must be real as 
well. In that case, oscillations will be absent in Eq. (22). For non-symmetric 
Q, we can apply the Frobenius- Perron theorem [42] if the decay rates satisfy 

(D)« < (QA) U for all i. (24) 

The Frobenius-Perron theorem is applicable to a matrix with all positive en- 
tries, and it guarantees a real largest eigenvalue for that matrix. Consequently, 
we have at most exponentially damped oscillations as long as (24) is obeyed. 
In addition to that, the Frobenius-Perron theorem states that the eigenvec- 
tor corresponding to this largest eigenvalue has only strictly positive entries, 
and hence, that this eigenvector can be interpreted as a vector of chemical 
concentrations if normalized appropriately. 

Let us write down a more explicit solution to (16), under the assumption 
that the spectrum of W is known (exact spectra of evolution matrices have 
been derived in [9,43,44]). Let A& be the eigenvalues of W, and let 4> k be the 
associated eigenvectors. Without loss of generality, we order the eigenvalues 
such that A > Ai > A 2 > . . . . The solution to Eq. (16) can then be expressed 
as 

y(t)=J2ak<t> k e Xkt , (25) 
k 

where the coefficients a k have to be chosen such that the initial condition 
y(t — 0) =: y is satisfied. In other words, the give the expansion of the 
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initial condition y in terms of the eigenvectors <j) k , 

Vo = Y, a k ( t>k- (26) 

k 

The principal eigenvector (f) of W has been called the quasispecies by Eigen. 
The reason for this name will become clear shortly. 

In terms of the concentration variables x(t), the solution Eq. (25) becomes 

-w = ^&- (27) 

where e is again a vector with all entries equal to 1. 

As we already mentioned, the Perron-Frobenius theorem assures that the 
largest eigenvalue Ao is non-degenerate, and that all components of the cor- 
responding eigenvector cf> are positive. We factor out the exponential of the 
largest eigenvalue e A °* in the numerator and denominator of Eq. (27) and 
obtain 

x(f) = a,0o + E t >,^^°»' (2g) 

As long as «o 7^ 0, all contributions apart from the one corresponding to the 
largest eigenvalue disappear in the limit t — > oo. Hence, the system evolves to- 
wards a steady state, characterized by the dominant eigenvector of the matrix 
W. 

The case ao = is somewhat artificial, because it implies that the system 
has been started in an exact superposition of eigenstates excluding the domi- 
nant quasispecies. In that case, the system cannot evolve towards <p . Instead, 
it converges towards the eigenvector corresponding to the next-largest eigen- 
value. In a real chemical system with more or less random initial conditions 
and under the presence of noise, the case a = is of no relevance. 

The appearance of a steady state distribution of concentration variables has 
important implications for the understanding of Darwinian evolution. In gen- 
eral, the eigenvectors (f) k are a mixture of several molecular species %. As a 
consequence, a number of species is present simultaneously in the asymptotic 
distribution. This means that selection combined with random mutation does 
not remove all but one molecular species, even if there exists one (the master 
sequence) whose replication coefficient exceeds all others. Instead, the inter- 
play of selection and mutation creates a cloud of mutant species around the 
master sequence. It is this cloud that is termed the quasispecies. The mutant 
distribution forms a unit competing with other similar units, represented by 
the other eigenvectors. Selection acts on these mixtures of sequences, and ulti- 
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1.0 




error rate R 

Fig. 1. The relative concentrations of the master sequence and the error classes for 
sequences of length I = 20 versus the error rate R in a sharply peaked landscape with 
replication coefficients Aq = 10, Ai, . . . , A20 = 1. The concentrations of all sequences 
with the same distance d to the master are summed up and displayed as a single 
line. The decay constants have been set to Dj = D, with arbitrary D. In static 
fitness landscapes, the decay constants drop out of the asymptotic quasispecies if 
they are all equal. 

mately singles out the dominant one, i.e., the one corresponding to the largest 
eigenvalue. 

Interestingly, the master sequence does not necessarily occupy a large fraction 
of the quasispecies. In fact, as the mutation rate increases, it is common that 
the master dwindles while the one-mutant and the two-mutant sequences form 
the prevailing part of the quasispecies. Figure 1 shows the asymptotic sequence 
distribution versus the error rate R for a sharply peaked landscape in which 
all but one of the replication coefficients are identical, while the remaining one 
exceeds them significantly. The mutants are grouped into error classes. If copy 
errors are not present, i.e., for R — 0, the master sequence is the only species 
present, and all others have a vanishing concentration. As soon as R takes 
on values slightly above 0, the concentration of the master sequence starts to 
decrease. At the same time, the concentrations of the mutants begin to grow, 
first that of the one-mutant sequence, than that of the two-mutant sequence, 
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and so on. At i? = 0.11 we observe a sharp transition, beyond which the con- 
centrations take on constant values. The mutation rate at which this transition 
occurs is called the error threshold. Beyond the error threshold, the outcome 
of a replication event can be considered a random sequence, and therefore, all 
molecular species take on the same concentration in this regime. The fact that 
all concentrations are equal beyond the error threshold is somewhat blurred 
in Fig. 1 because of the use of error classes. In fact, the heights of the different 
curves merely reflect the sizes of the corresponding error classes. For R — |, 
the sequences replicate stochastically in any landscape, since faithfully copied 
and erroneously copied symbols are then equally likely. The outcome of a sin- 
gle copy process is therefore a random sequence in that case. When R comes 
close to the value 1, almost every symbol is copied incorrectly. This implies 
that the copy is, apart from mutations, the inverse of the original sequence 
(note that this is a peculiarity of binary strings). As a consequence, order 
can again be seen for large R. In Fig. 1, the inverse error threshold occurs 
at R = 0.94. Beyond that point, the inverse master sequence dominates the 
quasispecies. 

If an evolving system displays an error threshold, the amount of informa- 
tion that can be acquired by the system is severely limited. In the case of 
the sharply peaked landscape, the critical mutation rate at which the error 
threshold occurs decreases as 1 — (l/er) 1 /' with increasing string length [45]. 
The quantity o in that expression stands for the relative advantage of the 
master sequence over the other sequences. For large /, the critical mutation 
rate is therefore very close to 0. This implies that for a given mutation rate, 
the molecules can grow to a certain length, while beyond that length, all in- 
formation is lost. Incidentally, it follows that the spontaneous formation of 
complex self- replicating molecules seems to be impossible. Eigen and Schuster 
attempted to solve this problem with the concept of the hypercycle [6]. How- 
ever, it must be said that not all landscapes do present an error threshold. 
In particular, multiplicative landscapes [46-48] show a smooth crossover from 
the completely ordered situation at R — 0, where the master sequence is the 
only sequence present, to R = |, where all molecules take on the same concen- 
tration. Moreover, the decay of the master sequence's concentration depends 
only on the mutation rate and the selective advantage, but not on the length 
of the strings. Hence, in a multiplicative model, sequences can grow in length 
without bound, and self-replicating molecules might form spontaneously. 

The error threshold can be viewed as the critical point of a phase transi- 
tion. This fact is well established through a rich body of work, in which 
the quasispecies model and related models have been mapped onto spin lat- 
tices [49,10,50], spin chains [51,48], and more recently also onto the statistical 
mechanics of directed polymers in a random medium [44] . The order parame- 
ter that is generally being used to describe that phase transition is the average 
overlap of the sequences in the population with the master sequence, given by 



14 



the expression I — 2d(0,i), where d(0,i) is the Hamming distance between a 
sequence of type % and the master sequence. The overlap takes on the value 
/ if we compare the master sequence with itself, the value — / if we compare 
it with its inverse, and intermediate values for all other sequences. The order 
parameter reads 

m s = jY,Xi[l-2d(0,i)]. (29) 

We use the symbol m s because the order parameter represents the surface mag- 
netization when we map the quasispecies model onto an Ising lattice [10,50]. 

2.2 Time- dependent replication rates 

Having developed a good understanding of the concepts of molecular evolution 
theory in static environments, let us move on to the non-static case. The basic 
quasispecies equation (11) changes only in so far as the matrix W now becomes 
time dependent, 

x(t) = [W(t) -E(t)l]x(t) . (30) 

In the most general case, the time dependency may stem from the replication 
rates, from the decay rates, or even from the mutation matrix: 

W(t) = Q(t)A(t) - D(t) . (31) 

The average excess production is then 

E(t) = e l ■ [A(t)x(t) - T>(t)x(t)] . (32) 

As in the static case, we can transform away the nonlinearity in Eq. (30) with 
the introduction of unnormalized variables 

y(t)=exp^E(r)dr^x(t). (33) 

The resulting equation is again a linear differential equation, however, this 
time with non-constant coefficients: 

y(t) = W(t)y(t) . (34) 

As in the static case, if all decay constants are equal, i.e., D(£) = diag(-D(t), . . . , D(t)) 
with a single scalar function D(t), then an extended transformation similar to 
Eq. (18) removes the decay constants from Eq. (34). 

In the previous subsection, we were able to immediately write down a solu- 
tion for the linearized quasispecies equation. In the case of a dynamic fitness 
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landscape, however, no such a simple closed form solution exists. Instead, we 
have to be satisfied with partial solutions, approximations, and limiting cases. 
In particular, we cannot generally define a quasispecies as the steady state 
the system approaches for t — > oo. Therefore, a central question in relation to 
dynamic landscapes is the appropriate definition of a quasispecies concept. 

For now, we start collecting information about Eq. (34). First of all, we note 
that we can map the quasispecies model onto a linear system with a symmetric 
matrix W(t) if Q(t) is symmetric for all t. This can be seen by introducing 

z(t) = A^(t)y(t) . (35) 

Differentiation yields 

z(t) = W(t)z(t) (36) 

with 

W(t) = A 1 / 2 (t)Q(t)A 1 / 2 (t) - D(t) + 

Unfortunately, we cannot write down a solution for Eq. (36) from the knowl- 
edge of the eigensystem of W(£) if W(£) has an arbitrary time dependency. 
However, the above mapping shows that for symmetric Q(t), we may safely 
assume that the matrix W(£) has only real eigenvalues. 

There are two limiting cases that we can discuss without specifying a land- 
scape, namely very fast changes in W(t) on the one hand, and very slow 
changes in W(t) on the other hand. We begin with the case of very slow 
changes. For the rest of this work, we will assume that W(i) has a real spec- 
trum for all t. To be on the safe side, we also assume that (24) is satisfied for 
all t. In that way, the Perron eigenvector of W(t) can always be interpreted 
as a vector of chemical concentrations. 



s Al/2 ( ( » 



A- 1/2 (t) . (37) 



For every time to, we can define a relaxation time 

(38) 

where Ao(£)o and Ai(to) are the largest and the second largest eigenvalue of 
W(io), respectively. The time tr(£o) gives an estimate of how long a linear 
system with matrix W(£ ) needs to settle into equilibrium. Therefore, if the 
changes in W(t) happen on a timescale much longer than 7r(£), the system 
is virtually in equilibrium at any given point in time. Hence, for large enough 
t, the quasispecies will be given by the Perron eigenvector of W(i). Strictly 
speaking, this is only true if there is always some overlap between the largest 
eigenvector of W(t) and the one of W(i + dt), but in all but some very patho- 
logical cases we can assume this to be the case. 
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The situation of fast changes in W(i) is somewhat more difficult, because, as 
we are going to see later on, we have to define a suitable average over W(£) 
in order to make a general statement. Therefore, we postpone this case for a 
moment. A detailed discussion of fast changes will be given for the particular 
case of periodic fitness landscapes in the next section, and later on, we will 
discuss fast changing landscapes in general. 



3 Periodic fitness landscapes 

3. 1 Differential equation formalism 

In this section, we study periodic time dependencies in W(t), for which we 
can prove several general statements. 

If the changes in W(£) are periodic, i.e., if there exists a T such that 

W(t + T) = W(t) for all t, (39) 

then Eq. (34) turns into a system of linear differential equations with periodic 
coefficients. Several theorems are known for such systems [52]. Most notably, 
if Y(t, t ) is the fundamental matrix, such that every solution to Eq. (34) can 
be written in the form 

y(t)=Y(t,t )y(t ), (40) 

then we can define the monodromy matrix X(t ), 

X(t )=Y(t + T,to), (41) 

which simplifies Eq. (40) to 

y(t)=Y(t + <P,t )X m (t )y(t ) 

= X m (t + 0)Y(f o + 0, t )y(t ) , (42) 

for the decomposition t = mT + <fi + to with the phase < T. In particular, 
we have 

? /(0 + mT) = X m (0) ? /(0), (43) 

so that for every phase 0, we have a well defined asymptotic solution, given by 
the eigenvector associated with the largest eigenvalue of X(0). Hence, periodic 
fitness landscapes allow the definition of a quasispecies, much in the same 
way as static fitness landscapes do. However, here the quasispecies is time- 
dependent. In other words, a system that evolves in a periodic fitness landscape 
runs for t — > oo into a limit cycle with period length T. 
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3.1.1 Neumann series for X 



With the knowledge of the monodromy matrix X, the calculation of the sys- 
tem's limit cycle becomes a simple eigenvalue problem. The monodromy ma- 
trix, however, can in general not be given in a closed from. Consequently, we 
have to rely on expansions in various parameters. In this section, we derive 
a formal expansion in T for the monodromy matrix. This formal expansion 
is similar in spirit to the Neumann series which gives a formal solution to an 
integral equation, and it is based on the Picard-Lindelof iteration for differen- 
tial equations. As the first step, we have to rewrite Eq. (34) in the form of an 
integral equation, i.e., 

y(t + r)= y(t ) + [ T W(t + n)y(to + n)dn . (44) 
Jo 

Our goal is to solve this equation for y(t + r) by iteration. Our initial solution 
is 

y (t + t) = y(t ) , (45) 

which we insert into Eq. (44). As a result, we obtain the 1st order approxima- 
tion 

!/i(*o + r) = y(t ) + f W(t + rjy^dn . (46) 
Jo 

Further iteration yields 

y 2 (to + r) = y(t ) + f W(t<, + 7i)j/(f )tZTi 
Jo 

+ f W(i + n) f 1 W(* + r 2 )y(t )rfr 1 rfr 2 , (47) 

JO JO 

and so on. Now we define 



W (t ,T) = l, (48) 
W 1 (t ,r) = - rW^o + rOdn, (49) 

T Jo 

and, in general 

W fc (f , T)4f W(t + n) r W(t + r 2 ) • • • r 1 W(t + r fc )dridr 2 • • • dr fc , 
r K Jo jo jo 

(50) 

and obtain the formal solution 

oo 

y(to + t) = Y, r k W k (t , r)y(t ) . (51) 

k=0 
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For suitably small r, the infinite sum on the right-hand side is guaranteed to 
converge. When we compare this equation for r = T to the definition of the 
monodromy matrix Eq. (41), we find that [introducing W fc (t ) : = Wfc(to,T)] 

oo 

X(t ) = Y,T k W k (t ). (52) 

k=0 

In particular, since Wi(t ) is identical to the time-average over W(t), regard- 
less of to, we have the high-frequency expansion 

X(t ) = 1 + TW+0(T 2 ), (53) 

with 

W = 1 f T W(t) dt . (54) 



T Jo 

Equation (53) reveals that for very high frequency oscillations, the system 
behaves as if it was subject to a static landscape. That static landscape is 
given by the dynamic landscape's average over one oscillation period. 

The radius of convergence of the expansion Eq. (52) can be estimated as 
follows. Since all entries of W(t) are positive, the tensor 

W ivi W vlV2 ■ ■■W, k _ lj (t ) := jT W ivi (t + ri) £ W vlU2 (t + r 2 ) • • • 

/ w v k -ij (*o + T k )dT X dr 2 ■■■dr k (55) 

J 

can be bound by 

W iUl W VlU2 ■ ■ ■ W Vk _ d (t ) <Yk£ W ivi (t + r)dr £ W ulV2 (t + r)dr ■ ■ ■ 

f W Vk _ d {t + r)dr , (56) 

J 

from which it follows that 

(W t(t0 )) i . < (W 1 ).. . (57) 
The matrix norm induced by the sum norm 

||(S/l,3/2,...,J/n)||i = JZ|j/i| ( 58 ) 



is the column-sum norm 



W 



^^{ew} • ( 59 ) 
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With that norm, and using Eq. (57), we can estimate 

W fc (to) < W fe < W 



(60) 



Hence, the expansion Eq. (52) converges certainly for those T that satisfy 

T 



W 



< 1. 



(61) 



Since all entries in W are positive, we have further 



w 


= max < 




1 3 \ 



max {A,- — Djj , 



(62) 



where the bar in Aj and Dj indicates that these quantities are averaged over 
one oscillation period. The second equality holds because of (24) and because 
of J2i Qij — 1- Without loss of generality, we assume that the maximum is 
given by A — D . Then, Eq. (61) is satisfied for 



T < 



A n ~Dr 



(63) 



It is interesting to compare this expression to the relaxation time of the time- 
averaged fitness landscape, Tr. To 0th order, the principal eigenvalue of W is 
given by W^oo- The second largest eigenvalue is to the same order given by the 
second largest diagonal element of W, which we assume to be Wu without 
loss of generality. Hence, the relaxation time is approximately given by 



1 1 1 

TR = Woo- Wu > Wo~ A -D ' 



(64) 



which is in general larger than the radius of convergence of Eq. (52). In partic- 
ular, if the largest and the second largest eigenvalue of W lie close together, 
the relaxation time may be much larger than the largest oscillation period 
for which the expansion is feasible. This restricts the usefulness of Eq. (52) 
to substantially high frequency oscillations in the landscape. The interesting 
regime in which the changes in the landscape happen on a time scale compa- 
rable to the relaxation time of the system can unfortunately not be studied 
from Eq. (52). 



3.1.2 Exact solutions for R = and R = 0.5 

The two extreme cases R = (no replication errors) and R = 0.5 (random 
offspring sequences) allow for an exact analytic treatment. The second case 
is identical to the case of static landscapes, and therefore we will mention it 
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only briefly. At the point of stochastic replication R = 0.5, the population 
dynamics becomes independent of the details of the landscape. As a conse- 
quence, temporal changes in the landscape must become less important as R 
approaches R = 0.5. However, this is not very surprising, since in most cases, 
an error rate close to 0.5 implies that the population has already passed the 
error threshold, which in turn implies that it does not feel the changes in the 
landscape any more. 

The case of R = 0, on the other hand, is more complex than the corresponding 
case in a static landscape. Since the matrix Q becomes the identity matrix for 
R = 0, Eq. (34) reduces to 

y(t) = [A(t) - B(t)]y(t) . (65) 

The matrices A(t) and D(t) are diagonal by definition, and hence, a solution 
to Eq. (65) is given by 

y(t) = exp (/W') - D(0]d*') !/(*o) • (66) 

When we compare this expression to Eqs. (40) and (41), we find 

Y(t, t ) = exp (£[A(t') - D(0]<ft') , (67) 

and, in particular, 

X(0) = exp ^ +T [A(t') - T>(t')}dt^J . (68) 

The integral in the second expression is taken over a complete oscillation 
period, and hence, it is independent of 0. Thus, we find for arbitrary 

X(0) = exp(W) for R = Q. (69) 

With a vanishing error rate, the monodromy matrix becomes the exponential 
of the time-average over W(£). Since the exponential function only affects the 
eigenvalues of a matrix, the quasispecies is given by the principal eigenvector 
of W, irrespective of the length of the oscillation period T. In other words, in 
the absence of mutations will the sequence % with the highest average value of 
Ai(t) — Di(t) take over the whole population after a suitable amount of time, 
provided it existed already in the population at the beginning of the process. 
By continuity, this property must extend to very small but positive error rates 
R. So, similar to the case of R — 0.5, the temporal changes in the landscape 
loose their importance when R approaches 0. 

There is, however, a caveat to the above argument. In case the largest eigen- 
value of W is degenerate, temporal changes in the landscape may continue to 
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be of importance for R = 0. A degeneracy of the largest eigenvalue of W is 
possible, because the Frobenius-Perron theorem applies only to positive error 
rates. For degenerate quasispecies, the initial condition y(t ) determines the 
composition of the asymptotic population. In this context, let us consider the 
general solution for periodic fitness landscapes, Eq. (42). We have 

?/(t) = X m (0)?/(to + 0) (70) 

with 

y(t + (f>) = Y(to + <Mo)!/(t ). ( 71 ) 

So even if X becomes independent of for R = 0, this need not be the case 
for y(t + 0), because of Eq. (71). If the largest eigenvalue of W is degenerate, 
variations in y(t + 0) will remain visible for arbitrarily large times t. Hence, 
we will observe oscillations among the different quasispecies which correspond 
to the largest eigenvalue. Clearly, this effect is the more pronounced the larger 
the oscillation period T. 



3.1.3 Schematic phase diagrams 

The results of the previous two subsections allow us to identify the general 
properties of the quasispecies model with a periodic fitness landscape at the 
borders of the parameter space. As parameters, we shall only consider error 
rate R and oscillation period T, since all other parameters (replication rates, 
decay rates, details of the matrix Q) do not influence the above results. In 
Fig. 2, we have summarized our findings. Along the abscissa runs the oscilla- 
tion period. For very fast oscillations, the evolving population sees only the 
time-averaged landscape. For very slow oscillations, on the other hand, the 
population is able to settle into an equilibrium much faster than the changes 
in the landscape occur. Hence, the population sees a quasistatic landscape. 
Along the ordinate, we have displayed the error rate. We discarded the re- 
gion above R = 0.5, in which anti-correlations between parent and offspring 
sequences are present, as it is redundant. For R = 0.5, all sequences have ran- 
dom offspring, and hence, all sequences replicate equally well. Therefore, for 
this error rate, the landscape becomes effectively flat. On the other side, for 
R = 0, we have again the time-averaged landscape. However, for large T, the 
fact that we see the average landscape does not mean that the concentration 
variables are asymptotically constant. Degeneracies in the largest eigenvalue 
may cause a remaining time dependency due to oscillations between super- 
imposed quasispecies. The exact form of these oscillations is dependent on 
the initial condition 2/(0). For small T, the oscillations disappear, because the 
ratio of newly created sequences during one oscillation period and remaining 
sequences from the previous oscillation period decays with T [Eq. (53)]. 
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Fig. 2. The appearance of a periodic fitness landscape at the border regions of the 
parameter space. 

From the above observations, we can derive generic phase diagrams for periodic 
fitness landscapes. There are two main possibilities. The fitness landscape may 
average to a landscape that has a distinct quasispecies, or it may average to a 
flat landscape. These two cases are illustrated in Fig. 3. Note that the diagrams 
are meant to illustrate the qualitative form and position of the different phases. 
In their exact appearance, they may differ substantially from the exact phase 
diagram of a particular landscape. 

If an averaged landscape sports a distinct quasispecies, then for every os- 
cillation period T and every phase of the oscillation 0, we have a unique 
error threshold R*(T,(f>). For small T, the error threshold converges towards 
that of the average fitness landscape, _R* V , irrespective of the phase 0. For 
larger T, the error threshold oscillates between R* Q = min^ R*(T, 0) and 
= max^, R* (T, 0) . In the limit of an infinitely large oscillation period, 
Rfo converges towards R* nax , which is the largest error threshold of all the 
(static) landscapes W(0). Similarly, R* Q converges towards -R^in i n that limit, 
where i?* lin is accordingly defined as the smallest error threshold of all land- 
scapes W(0). For a fixed oscillation period T and a fixed error rate R with 
R* a < R < it£i , we have necessarily R > R*(T,cf>) for some phases 0, and 
R < R*(T,(f)) for the rest of the oscillation period. As a result, a quasis- 
pecies will form whenever R > R*(T, 0), but it will disappear again as soon as 
R < R*(T,4>). This phenomenon has for the first time been observed in [17], 
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where the region of the parameter space in which it can be found has been 
called the temporarily ordered phase. In this phase, whether we observe order 
or disorder depends on the particular moment in time at which we study the 
system. Correspondingly, we will call a phase "ordered" only if order can be 
seen for the whole oscillation period, and we will call a phase "disordered" 
if during the whole oscillation period no order can be seen. The relationship 
between the ordered phase, the disordered phase, and the temporarily ordered 
phase for the first type of landscapes is displayed in Fig. 3a. Compare also the 
phase diagram of the oscillating Swetina-Schuster landscape in Fig. 4. 

In a landscape that averages to a flat one, on the other hand, the disordered 
phase must extend over the whole range of R for sufficiently small T, and 
order can be observed only above a certain T min . The behavior of the system 
for small R above T min cannot in general be predicted solely from the knowl- 
edge of Fig. 2. For a landscape with a flat average, the eigenvalues of the 
monodromy matrix are degenerate for R = 0. Therefore, in the limit R — > 0, 
the Perron eigenvector can, in principle, converge to any superposition of the 
eigenvectors of X(i? = 0). This situation is visualized in Fig. 3b. If the limit 
corresponds to a non-homogeneous sequence distribution, the ordered phase 
extends to arbitrarily small R [indicated by the solid line in Fig. 3b]. If, on 
the other hand, the limit would correspond to a homogeneous sequence distri- 
bution, we might find a lower error threshold below which the system would 
again be in disorder [this is indicated by the dashed line in Fig. 3b]. Since 
for longer oscillation periods, the oscillations in the degenerate quasispecies at 
R = become important, order would be observed for much smaller R with 
increasing T. Hence, the lower disordered phase would fade out for T — > oo. 
A study investigating under which situations the ordered phase extends to 
arbitrarily small R will be presented elsewhere. As a preliminary result, we 
can state that the limit R — > does in general not lead to a homogeneous 
sequence distribution. 



3.2 Discrete approximation 



The differential equation formalism we have used so far allows for an elegant 
discussion of the system's general properties. However, if we want to obtain 
numerical solutions, this formalism is not very helpful, because we do not have 
a general expression for the fundamental matrix Y(t,to) from Eq. (40), nor 
for the monodromy matrix X(t ) from Eq. (41). Therefore, for a numerical 
treatment we need to move over to the discretized quasispecies equation, 

y(t + At) = [AtW(t) + l]y(t). (72) 
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Fig. 3. Two possible phase diagrams of a periodic landscape. If W(t) averages to a 
non-flat landscape, there will typically be a lower error threshold, below which we 
always find order, and a higher error threshold, above which the system is always 
in a disordered state. If W(t) averages to a flat landscape, however, the disordered 
phase extends to the whole range of R for sufficiently small T. The dashed line 
inside the ordered regime in case b) is explained in the text. 

In the case of constant W, the quasispecies obtained from this equation is 
identical to that of Eq. (34), and it is also identical to that of equation 

l/(f + l)=Wy(t). (73) 

Equation (73) has been studied by Demetrius et al. [53], and has been em- 
ployed by Leuthausser [49,10] for her mapping of the quasispecies model onto 
the Ising model. In the general time-dependent case, however, the additional 
factor At and the identity matrix 1 of Eq. (72) are important, and cannot be 
left out. The analogue of the fundamental matrix for Eq. (72) reads 

Y(t + kAt, t ) = T { [] [AtW(* + v&t) + 1] } , (74) 

where T{-} indicates that the matrix product has to be evaluated with the 
proper time ordering [26]. Similarly, the analogue of the monodromy matrix 
becomes 

X(t )=Y(t + T,t ) 

= T | J] [AtW(t + uAt) + 1] | , (75) 

where we have assumed that T is an integral multiple of At, and where n = 
T/ At. The influence of the size of At on the quality of the approximation 
has been investigated in [26]. A more in-depths discussion of the relationship 
between the continuous and the discrete quasispecies model can also be found 
in [19]. 
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3.3 Example landscapes 



For the rest of this section, we are going to study several example landscapes, 
in order to illustrate the implications of our general theory. In all cases con- 
sidered, we represent the molecules as bitstrings of fixed length /. Moreover, 
we assume that a single bit is copied erroneously with rate R, irrespective of 
the bit's type and of its position in the string. 

3.3.1 One oscillating peak 

In previous work on the quasispecies model with periodic fitness landscapes [26,17], 
most emphasis has been laid on landscapes with a single oscillating sharp peak. 
As a generalization of the work of Swetina and Schuster [7], the master se- 
quence has been given a replication rate Ao(t) ^> A, where A is the replication 
rate of all other sequences. The replication rate A (t) has been expressed as 



with a T-periodic function /(£). The generic example for that function is 
fit) = sin(u;t), leading to 



The parameter e allows a smooth crossover from a static landscape to one 
with considerable dynamics, and the exponential assures that A (t) is always 
positive. 

In [26,17] it has been found that the behavior at the border regions of the 
parameter space is indeed as it is depicted in Fig. 2, and that a phase diagram 
of the form of Fig. 3a correctly describes the relationship of order and disorder 
in an oscillating Swetina-Schuster landscape. Here, we put less emphasis on the 
numerical simulations that lead to these conclusions, but instead show that the 
phase borders in such a phase diagram can, for an oscillating Swetina-Schuster 
landscape, be calculated approximately. 

For static landscapes with a single peak, the assumption of a vanishing mu- 
tational backflow into the master sequence allows to derive an approximate 
expression for the error threshold [45,11,13]. A similar formula can be devel- 
oped to calculate the error threshold as a function of time in a landscape with 
a single oscillating peak. But before we turn towards the dynamic landscape, 
we shall rederive the expression for the master sequence's concentration xq in 
a static landscape, based on neglecting mutational backflow. The expression 
we shall find is slightly more general than the one that was previously given, 
and it will be of use for the periodic fitness landscape as well. 



Ao(t) = 4>,stat exp[e/(t)] , 



(76) 



A (t) = A , stat exp[esin(a;t)] , 



(77) 
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The Oth component of the quasispecies equation (30) becomes, after neglecting 
the mutational backflow, 



x (t) = W 00 x (t) - E(t)x (t) . 



(78) 



The average excess production E(t) can be expressed in terms of x(t) and W 
as 



E{t) = Y J W lJ x 3 {t). 



(79) 



With that expression, the solution of Eq. (78) requires the knowledge of the 
stationary mutant concentrations Xj, which are usually unknown. To circum- 
vent this problem, we make the somewhat extreme assumption that all mutant 
concentrations are equal. Although this assumption, which is equivalent to the 
assumption of equal excess productions E^ in the usual calculation without mu- 
tational backflow, will generally not be true, it works fine for Swetina-Schuster 
type landscapes. With this additional assumption, Eq. (79) becomes 



E(t) = £ 



j>0 iy 1 



(80) 



where N is the number of different sequences in the system. When we insert 
this into Eq. (78) and solve for the steady state, we find 



x = 



Woo — iv^T ^j>o Wij 



(81) 



The expressions involving sums over matrix elements in Eq. (81) can be iden- 
tified with the excess production of the master, 



iO 



(82) 



and with the average excess production without the master, 

1 



E_o = 



N 



(83) 



i j>0 



if W has the standard form QA — D. Therefore, Eq. (81) corresponds to the 
often quoted result 



Won -E 



x = 



oo ~ 
Eo — E-o 



(84) 



However, Eq. (81) is more general in that it can be used even if W is not given 
as QA - D. 
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The idea here is to insert the monodromy matrix into Eq. (81) in order to 
obtain an approximation for xq in the case of periodic landscapes. But under 
what circumstances can we expect this to work? After all, Eq. (81) has been 
derived from an equation with continuous time, Eq. (78), whereas the mon- 
odromy matrix advances the system in discrete time steps, as can bee seen 
in Eq. (43). The important point is here that we are only interested in the 
asymptotic state, which is given by the normalized Perron vector of the mon- 
odromy matrix, whether we use discrete or continuous time. Therefore, we are 
free to calculate the asymptotic state in a periodic landscape for a given phase 
(f) from 

y(t) = X(4>)y(t) , (85) 

even if this equation does not have a direct physical meaning for finite times. 
The asymptotic molecular concentrations are then given by the limit t — > oo 
of 

x(t) = . (86) 

w e-y(t) 1 ; 

From differentiating Eq. (86) and inserting Eq. (85), we obtain 

x(t) = X((f>)x(t) - x(t) (e • [X(0)x(t)]) . (87) 

When we neglect the backflow onto the master sequence, the Oth component 
of that equation becomes identical to Eqs. (78) and (79), but with the matrix 
X(0) instead of W. This shows that we may indeed use Eq. (81) as an ap- 
proximation for the asymptotic concentration of x . Of course, since we have 
neglected mutational backflow, this approximation works only for landscapes 
in which a single sequence has a significant advantage over all others. But this 
restriction does equally apply to the static case. Numerically, we have found 
that Eq. (81) works well for a single oscillating peak, and that it breaks down 
in other cases as expected. 

With the aid of Eq. (81), we are now in the position to calculate the phase 
diagram of the oscillating Swetina-Schuster landscape. When we insert the 
monodromy matrix X(0) into Eq. (81), we are able to obtain (numerically) 
the error rate at which x vanishes, R*(T,<f>). From that expression, we can 
calculate R* Q and The results of the corresponding, numerically extensive 
calculations are shown in Fig. 4, together with i?* v , -R^ ax , and R^ in , which 
have also been determined from Eq. (81). 

We find that both R^ Q and R^ approach R* av for T — > 0, as predicted by our 
general theory. For T — ^ oo, R^ grows quickly to the level of -R^ax; but a slight 
discrepancy between the two values remains. This is due to the complexity of 
the numerical calculations involved for large T. We can only approximate 
the monodromy matrix by means of Eq. (75), and we need ever more factors 
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Fig. 4. The phase diagram of an oscillating Swetina-Schuster landscape 
[Aq (t) = e 2 - 4 exp(2sincj£)], obtained numerically from Eq. (81). 

AtW(t + uAt) + 1 for large T. The discrepancy between R* Q and R^ hl , on 
the other hand, has a different origin. The main cause here is the fact that the 
relaxation into equilibrium is generally slower for smaller error rates. There- 
fore, R* Q needs a much larger T to reach -Rj^in than it is the case with R^ and 

Very recently, Nilsson and Snoad [24] have demonstrated that the method 
of neglecting back mutations described above can be used to calculate an 
approximate analytic solution for the oscillating peak. They exploit the fact 
that if all replication rates except A (t) are equal to 1, Eq. (78) becomes 

x (t) = A (t)x (t) - E(t)x (t) , (88) 

with 

E(t) = [A (t) - l]x (t) + 1 (89) 

and Q = (1 — R) 1 . Eq. (88) can then be solved exactly by introducing a 
nonlinear transformation 

x {t) - y(t) = n g "? ( l , (90) 

[1 - Q\xo{t) 



which linearizes the equation. For large t, Nilsson and Snoad obtain 



Xo(t) = Q 



1 + (1 



-Q) f e -Sl^Mu)-i]du ds 
Jo 



(91) 
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Interestingly, this expression holds for arbitrary A (t), and not only for peri- 
odic ones. 

From our discussion in Sec. 3.1, we know that the master sequence's concen- 
tration will, for fast changes, settle to the value corresponding to the average 
replication rate, and for very slow changes it will be virtually in equilibrium 
with the current replication rate. In the intermediate regime, we expect the 
concentration to lag somewhat behind the replication rate. This behavior can 
be seen as a low-pass filtering of the environmental changes, performed by the 
evolving population. The idea to look at time-dependent fitness landscapes 
from that perspective was first introduced by Hirst [27,28], who reevaluated 
similar observations made in population genetic models (without explicit land- 
scape) [54-57]. For the quasispecies model, phase shift and amplitude of the 
response to a sinusoidal replication rate have been determined computationally 
in [26], with the result that these curves do indeed have the appropriate form 
for a low pass filter. Based on Eq. (91), an analytic expression for amplitude 
and phase shift has been given in [24]. 



3.3.2 Validity of the expansion in T 

In the previous section, we have calculated the phase diagram in a landscape 
with a single oscillating peak. In this section, we are going to asses the validity 
of the monodromy expansion in T in a similar landscape. Since the integrals 
involved become very unpleasant if we choose the master sequence's replication 
rate to be proportional to exp[sin(u;£)], we study here the related landscape 



This landscape follows from expanding Eq. (77) to first order in e. Eq. (50) can 
be evaluated analytically to arbitrary order for that landscape. In Appendix A, 
we have carried out the corresponding calculations to 2nd order in T. 

In Fig. 5, we display the order parameter m s obtained from the expansion 
of X in terms of T and from the discrete approximation of X as a function 
of the phase for four different oscillation periods T. The results from the 
two different methods agree well for T < 0.1, but disagree for larger T. The 
disagreement results from a breakdown of the expansion in T when T be- 
comes large. Note that this breakdown occurs in the vicinity of the radius of 
convergence that we have estimated in Eq. (63). For the particular landscape 
of Fig. 5, the estimate guarantees convergence for oscillation periods below 
approx. 0.1. 



A (t) = A), st at[l + esm(ujt)] , 
Ai(t) = 1 for i > 0. 



(92a) 
(92b) 
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Fig. 5. Order parameter m s for a landscape with a single oscillating peak, Eq. (92a). 
The error rate is R = 0.06. Solid lines stem from the discrete approximation with 
n = 100, the dotted lines stem from the expansion in terms of T, Eq. (52), evaluated 
up to second order. Clearly, the expansion Eq. (52) is only of use for relatively short 
oscillation periods. 

3.3.3 Two oscillating peaks 

A single oscillating peak provides some initial insights into dynamic fitness 
landscapes. It is more interesting, however, to study situations in which sev- 
eral sequences obtain the highest replication rate in different phases of the 
oscillation period. The simplest such case is a landscape in which two se- 
quences in turn become the master sequence. Here, we will assume that the 
two are located at opposite corners of the boolean hypercube, i.e., that they 
are given by a sequence and its inverse. In that way, it is possible to group 
sequences into error classes according to their Hamming distance to one of 
the two possible master sequences. As an example, we are going to study a 
landscape with the replication coefficients 

A (t) = A ,stat exp (e sin out) , (93a) 
A t (t) = A , st at exp(-e sin tot) , (93b) 
Ai(t) = 1 for < % < I. (93c) 

The subscripts in the replication coefficients stand for the Hamming distance 
to the sequence 000 • • • 0. 

For single peak landscapes, it is instructive to characterize the state of the 
system at time t by the value of the order parameter m s (t) [Eq. (29)]. In 
principle, m s (t) can also be used for a landscape with two alternating master 
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sequences if they are each other's inverse. In that case, the Hamming distance 
has to be measured with respect to one of the two master sequences. If the 
population consists only of sequences of the other type of master sequence, 
we have m s {t) = —1. However, there is a small problem with degenerate 
landscapes in which the two peaks have the same replication rate. In such 
landscapes, the sequence distribution becomes symmetric with respect to the 
two peaks, i.e., xq — xi, x\ = xi-i, and so on. Then, m s (t) becomes zero 
because of this symmetry, although the population may be in an ordered state. 
To distinguish between the case of true disorder and the case of an ordered, 
but symmetrical population, we introduce the additional order parameters 

i L(i-i)/2J 

mf(t) = - £ xm-n, (94) 

1 i=0 

and 

™r(*) = y E Xi(t)[i-2%[. (95) 

1 i=i-L(J-l)/2j 

Here, |_^J stands for the largest integer smaller than or equal to x. 

The quantity mf(t) is always positive, mj(t) is always negative, and further- 
more, we have 

m s (t) = m+ (t) + m; (t) . (96) 

If the population is uniformly distributed over the whole sequence space, we 
have 

1 L('-i)/2j /A 

mt(t) = -m;(t) = - £ (JC - 2< )- (97) 

This expression tends to for I — > oo. If, on the other hand, only the two 
peaks are populated, each with half of the total population, we find 

mt(t) = -m-(t)= 1 -. (98) 

In the case that either m£(t) or mj(t) vanish, the population is centered about 
the respective other peak. 

In the following, when it is important to distinguish between true disordered 
populations and symmetric populations, we will use mf(t) and mj(t). When 
the situation is non-ambiguous, we will use m s (t) alone, in order to improve 
the clarity of our figures. 

In Fig. 6, we have displayed mf(t), m~(t) and m s (t) for the quasispecies in 
a fitness landscape of the type defined in Eq. (93). For a large oscillation 
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Fig. 6. Order parameters m s (t), mf(t), m~(t) as a function of the oscillation phase 
4> = (t mod T)/T in a landscape with two alternating peaks. The upper dashed 
line represents mf(t), the lower dashed line represents m~(t), and the solid line 
represents m s (t). The sequence length is / = 10, and we have used R = 0.05 and 
n = T/At = 100 in all four examples. The parameters of the fitness landscape are 
A),stat = e 2 - 4 , e = 2. 

period, T = 100, the quasispecies is at every point in time clearly centered 
around a single peak. The switch from one peak to the other happens very 
fast. When the landscape oscillates with a higher frequency, the transition 
time becomes a larger fraction of the total oscillation period. This makes the 
transition from one peak to the other appear softer in the graphs for smaller 
oscillation periods. For extremely small oscillations, the system perceives the 
average fitness landscape, which is a degenerate landscape with two peaks of 
equal height. As noted above, the quasispecies becomes symmetric in such 
a landscape. In the lower right of Fig. 6, for T = 0.01, we can identify this 
limiting behavior. Both mf(t) and mj(t) are nearly constant over the whole 
oscillation period with an absolute value close to 0.5. The deviation from 0.5 
stems from the finite value of the error rate, R = 0.05 in this example. We 
observe further that m s {t) lies very close to zero, thus wrongly indicating 
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Fig. 7. Order parameter function of the error rate R for various oscillation 

phases <fi = (t mod T)/T. The fitness landscape is identical to the one of Fig. 6, and 
the oscillation period is T = 100. Note that for = 0.10, the error threshold seems 
to have moved towards lower R, which is not the case. Instead, the population is 
symmetric, as would be revealed by graphing m+ or mj. 

a disordered state. Note that the absolute value of mf(t) for a uniformly 
spread population lies, for the parameters of this example, at 0.12 according 
to Eq. (97). 

Observations from the landscape with two oscillating peaks have to be inter- 
preted in the light of results of Schuster and Swetina on static landscapes with 
two peaks [12], who have found that if the peaks are far away in Hamming 
distance (which is the case here), a quasispecies is generally unable to occupy 
both peaks at the same time, unless they are of exactly the same heightQ. For 
two peaks with different heights, the quasispecies will for small R generally 
form around the higher peak. For larger R, however, the quasispecies moves 
to the lower peak if it has a higher mutational backflow from mutants, which 
is the case, for example, if the second peak is broader than the first one. The 
transition from the higher peak to the lower one with increasing R is very 
sharp, and can be considered a phase transition. In a dynamic landscape with 
relatively slow changes, the quasispecies therefore switches the peak quickly 
when the higher peak becomes the lower one and vice versa. 



This is true for infinite populations only. For finite populations, one of the two 
peaks will be lost eventually due to sampling fluctuations. 
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The exact time at which the switch occurs depends of course on the error rate. 
The lower the error rate, the longer the population remains centered around 
the previously higher peak until it actually moves on to the new higher peak. 
Therefore, if we observe the system at a fixed phase and change the error rate, 
the quasispecies undergoes, for certain phases 0, a transition similar to the one 
found in [12] for static landscapes. This is illustrated in Fig. 7, where we display 
the order parameter Trig cts du function of the error rate R. At the beginning of 
the oscillation period the quasispecies is, for all error rates R below the error 
threshold, dominated by the peak corresponding to m s = —1. This must be 
the the replication coefficients of the two peaks intersect at = 0, 

so up to this point the quasispecies has not had a chance to build up around 
the other peak. For phases shortly after = 0, the quasispecies gains weight 
around the other peak, starting from the error threshold on downwards. For 
= 0.15, we observe a relatively sharp transition from the peak corresponding 
to m s = -1 to the peak corresponding to m s + 1 at R xs 0.05. The transition 
then moves quickly towards R = 0, until the peak corresponding to m s = 1 
dominates the quasispecies for all R. For = 0.5, the replication coefficients 
intersect again, and the quasispecies is exactly the inverse of the one for = 0. 



4 Aperiodic or stochastic fitness landscapes 



As we have seen, periodic fitness landscapes can be treated rather elegantly. 
We have been able to define a meaningful quasispecies, as well as to determine 
the general dynamics in the border regions of the parameter space. It would 
be desirable to obtain similar results for arbitrary dynamic landscapes. After 
all, an aperiodic or stochastic change is much more realistic than an exactly 
periodic change. However, the definition of a time- dependent quasispecies is 
tightly connected to periodic fitness landscapes. For arbitrary changes, the 
concept of an asymptotic state ceases to be meaningful. Regardless of that, 
we can derive some results for the border regions of the parameter space. In 
Section 3.1, we derived the formal solution to Eq. (34), 

oo 

y(to + r) = J2 r k Wk(to, r)y(t ) . (99) 

fc=0 

This can be expanded to first order in r as 

y(to + r) = y(t ) + rWi(f , r)y(t ) . (100) 

Obviously, the composition of the sequence distribution changes very little 
over the interval [t , t + r] if the condition 

W 1 (t ,r)|| i «l (101) 
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is satisfied. This observation allows us to establish a general result for rapidly 
changing fitness landscapes. If the landscape changes in such a way that for 
every interval of length r beginning at time to, the average 



Wi(* ,r) 




W(f + r^dn 



(102) 



is approximately the same for every to, and the condition ||Wi(io, t) || <C 1/r 
holds, then the system develops a quasispecies given by the normalized prin- 
cipal eigenvector of the average matrix Wi(t ,r). With "approximately the 
same" we mean that for two times to and t±, the components of the averaged 
matrices satisfy 



with a suitably small e. In other words, if the fitness landscape changes very 
fast but in stationary way, then the evolving population sees only the time- 
averaged fitness landscape. 

For the special case of R = we can, as in Eq. (66), write the solution for the 
quasispecies equation as 



Unlike in the case of a periodic landscape, however, this does not tell us the 
general behavior at R — 0, apart from the fact that for fast changes, the system 
experiences the average fitness landscape. For stochastic landscapes with long 
time correlations, it is hard to make general statements. The reason for this is 
that from long time correlations, we cannot generally deduce that the system 
must be in a quasistatic state. In a landscape that is constant most of the time, 
but displays intermittent sudden changes, the system can be expected to be 
mostly in the quasistatic regime. However, one can easily construct landscapes 
that are in constant flux, and still display long time correlations. Hence, there 
exists no direct equivalent to the case of periodic fitness landscapes with large 
oscillation period for general stochastic landscapes. Nevertheless, we can draw 
a diagram similar to Fig. 2, where for the abscissa, we use the qualitative 
description "slow" and "fast" changes. Under "fast", we subsume everything 
that satisfies the above stated conditions under which the system experiences 
the average fitness landscape, and under "slow" we subsume everything else, 
assuming that a parameter exists that allows a smooth transition from the 
"fast" regime to the "slow" regime. Although Fig. 8 contains considerably less 
information than Fig. 2, the implications for actual landscapes are more or 
less the same. Most real landscapes will have a regime that can be associated 
with slow changes, and hence, we will typically observe phase diagrams of the 
type of either Fig. 3a) or b). 




for all i, j, t , h, 



(103) 



y(t) = exp (jfW) - D(f )]df) y(h) 



(104) 
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Fig. 8. The appearance of a stochastic fitness landscape at the border regions of the 
parameter space. 

As an example, consider the work of Nilsson and Snoad [18], and its subsequent 
extension by Ronnewinkel et al. [23]. Nilsson and Snoad studied a landscape 
in which a single peak performs a random walk through sequence space. The 
peak jumps to a random neighboring position of hamming distance 1 whenever 
a time interval of length Tj ump has elapsed. Ronnewinkel et al. studied a very 
similar fitness landscape, but focusing on deterministic movements of the peak 
that allow for the formal definition of a quasispecies, much as for the case of 
periodic fitness landscapes in Section 3.1. Ronnewinkel et al. could verify the 
results of Nilsson and Snoad on more fundamental theoretical grounds. 

The parameter Tj ump in the jumping-peak landscape determines whether the 
changes happen on a short or on a long time scale. If Tj ump is very large, the 
landscape is static most of the time, and the population has enough time to 
settle into equilibrium before the peak jumps to a new position. If t\ 



jump 



IS 



very small, on the other hand, the peak has moved away long before the pop- 
ulation has had the time to form a stable quasispecies. It was found that, for 
very fast changes, the population fails to track the peak, and selection breaks 
down. Nilsson and Snoad conclude therefore that "dynamic landscapes have 
strong constraints on evolvability" . However, this conclusion may have to be 
reevaluated if we reconsider their landscape from the viewpoint of the general 
theory developed here. In a fast changing landscape, it is the time-averaged 
fitness landscape which matters. In the particular example of a randomly mov- 
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ing peak, this average is in general not very meaningful. However, it allows us 
to render the observations of Nilsson and Snoad plausible, and to understand 
under what conditions these observations would change. If we consider an in- 
terval of length r for which Eq. (100) holds, and assume that the changes in the 
landscape are much faster than r, i.e., r <C Tj ump , then the matrix Wi(£ ,t) 
will contain a number of peaks with roughly the same (small) selective ad- 
vantage over the rest of the landscape. In the succeeding interval, the matrix 
Wi(io + t, t) will have a similar structure, but the peaks will be at different 
positions. In the long run, every single peak position has on average a van- 
ishing selective advantage over every other peak position, and a quasispecies 
cannot form. However, in the rare case that the peak by chance remains in a 
small region of the genotype space for a prolonged amount of time, a momen- 
tary quasispecies will form there. Thus, in the situation Nilsson and Snoad 
have studied, selection does not break down due to the mere fact that the 
landscape is changing fast, but it breaks down because the typical landscape's 
average over some time interval yields a highly neutral landscape. If the peak's 
movements were such that back jumps would be much more likely, or that the 
peak would be confined to a small portion of the sequence space, we would 
clearly see selection. This suggests the viewpoint that the time-averaged land- 
scape determines the "regions of robustness" in the landscape as the regions 
in which even fast changes in the landscape do not destroy the quasispecies. 

In addition to the complete breakdown of adaptation for a fast changing land- 
scape, Nilsson and Snoad observe a sharp decrease of order in their model for 
very low mutation rates. On first glance, one might be tempted to explain this 
observation with an asymptotically flat landscape for R — > 0, as described at 
the end of Sec. 3.1.3. However, this is the wrong explanation in the case of 
a randomly moving peak. The breakdown of adaptation for small R can be 
observed for such large Tj ump that for any non-zero R, the landscape cannot 
be considered flat. The true origin of that effect is the population's conver- 
gence onto a single point in the genome space for these low mutation rates. 
Therefore, the moment the peak position changes, there is no variance in the 
population that would enable it to move over to the new peak. Note that the 
nature of this effect is very different from the normal error catastrophe. The 
error catastrophe occurs because replication is too erroneous to allow for infor- 
mation to be stored in the sequences. As a result, a uniform population forms. 
In the case of small R in the moving peak landscape, however, the catastrophe 
occurs because the replication is too faithful. This means in particular that 
no uniform population forms. Rather, the population always grows to some 
extent on the new peak position, yet the peak does not rest long enough on 
that position to allow the sequences to grow to a macroscopic concentration. 
For the above reasons, we propose to refer to this effect as the convergence 
catastrophe, as opposed to the error catastrophe, in order to point out that 
the breakdown of adaptation in that case is not caused by erroneously copied 
sequences. 
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Another example of a stochastic fitness landscape is a noisy fitness landscape, 
i.e., one in which the fitness peaks have a noise term added to them (which 
means effectively that the population cannot measure the fitness exactly), 
or one in which a fitness peak has a random position. The first case has 
been studied by Levitan and Kauffman [58,59] in the framework of NK land- 
scapes [60,61], while the second case has been studied in the framework of 
population genetics by Gillespie [62]. In both cases, it has been observed that 
the population adapts to the mean fitness, which fits nicely into the general 
concepts we have presented here. 



5 Finite Populations 

In the previous sections, we exclusively studied infinite populations. However, 
as the genotype space generated by even moderately long sequences is ex- 
ponentially large (there are already 10 30 different sequences of length 100, 
for example) , it will be essentially empty for any realistic finite population. 
When most of the possible sequences are not present in the population, the 
concentration variables become useless, and the outcome of the differential 
equation formalism may be completely different from the actual behavior of 
the population. For static fitness landscapes, the effects of a finite population 
size are reasonably well understood. If the fitness landscape is very simple (a 
single peak landscape), the population is reasonably well described by finite 
stochastic sampling from the infinite population concentrations. Moreover, the 
error threshold generally moves towards smaller R with decreasing population 
size [14]. In a multi-peak landscape, the finite population localizes relatively 
quickly around one peak, and remains there, with a dynamics similar to that 
in a single-peak landscape. In the rare case that a mutant discovers a higher 
peak, the population moves over to that peak, where it remains again. The 
main difference between a finite and an infinite population on a landscape with 
many peaks is given by the fact that the infinite population will always build 
a quasispecies around the highest peak, whereas the finite population may get 
stuck on a suboptimal peak. Above the error threshold, a finite population 
starts to drift through genotype space, irrespective of the landscape. 

A finite population on a dynamic landscape will of course show a similar be- 
havior, but in addition to that other effects come into play that are tightly 
connected to the dynamics of the landscape. The most important difference 
between static and dynamic landscapes is the possible existence of a tem- 
porarily ordered phase in the latter case, which is where we should expect the 
major new dynamic effects. 

In the infinite population limit, the temporarily ordered phase generates an 
alternating pattern of a fully developed quasispecies and a homogeneous se- 
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quence distribution. What changes if a finite population evolves in that phase? 
At those points in time when a quasispecies is developed, the finite popula- 
tion's sequence concentrations are given by stochastic sampling from the in- 
finite population result, similarly to static landscapes. As soon as the quasis- 
pecies breaks down (and this may happen earlier than the infinite population 
equations predict, because the error threshold shifts to a lower error rate for 
a finite population), the population starts to disperse over the landscape. Be- 
cause of that, the population may loose track of the peak it was centered at 
previously. Therefore, when the system again enters a time interval in which 
order should appear, the population may not be able to form a quasispecies, 
thus effectively staying in the disordered regime, or it may form a quasispecies 
at a different peak. In this manner, the temporarily ordered phase can cre- 
ate a third possibility for a population to leave a local peak, in addition to 
the escape via neutral paths or to entropy-barrier crossing, which are present 
exclusively in static landscapes [63]. 

5.1 Numerical results 

The numerical results presented below have been obtained with a genetic 
algorithm on N sequences per generation. We have used the following mutation 
and selection scheme in order to remain as close as possible to the Eigen model: 

(1) To all sequences % in time step t, we assign a probability to be selected 
and mutated, 



Here, At is the length of one time step, and rij(i) is the number of se- 
quences of type i. 

(2) We choose N sequences at random according to the above defined prob- 
abilities Pi,mut(0 and Pi t se\(t)}. That means, we perform N independent 
drawings, and in each drawing, a sequence i has a probability Pi >Be i(t)} to 
be copied faithfully into the next generation, and a probability Pi, m ut(0 
to be copied erroneously. The N sequences such generated form the pop- 
ulation at time step t + At. 

Note that we assume generally 




(105) 



(106) 



DAt) < — for all i, t 



(107) 
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so that Pi tSe \(t) defined in Eq. (106) is always positive. 



For an infinite population, such a genetic algorithm evolves according to the 
equation 

x(t + At)=G(x(t),t), (108) 

where x(t) is the vector of concentrations at time t, and G(x, t) is the operator 
that maps a population at time t onto a population at time t + 1, 

G(x,t)= , + 

e* • ([AtA(t) - AtD(t) + l]a;) 

In Eq. (108), we can replace the non-linear operator G(x,t) with a linear 
operator G(y, t), 

G(y,f) = [AfW(t) + l]y. (110) 

The linear operator acts on variables y that are related to the concentrations 
x via 

By comparing Eq. (110) with Eq. (72), we observe that there exists a direct 
correspondence between the genetic algorithm for an infinite population and 
the discrete quasispecies model. This implies in particular that for periodic 
landscapes in the genetic algorithm, the expression for the monodromy matrix 
X(t ), Eq. (75), is exact. 

For a finite population, the operator G(x,t) still determines the dynamics. 
However, the deterministic description Eq. (108) has to be replaced by a 
probabilistic one, namely Wright-Fisher or multinomial sampling. If Gi(x,t) 
denotes the ith component of the concentration vector in the next time step, 
the probability that a population x\ = (mi, m,2, ■ ■ - )/N,J2i m i = N, produces 
a population £C 2 = (^1,^2, • • • )/N,J2i n i — N, in the next time step, is given 
by 

P( Xl ^x 2 ,t)=N\H Gl{Xl f )n \ (112) 

nil 

A proof that the stochastic process described by Eq. (112) does indeed con- 
verge to the deterministic process Eq. (108) in the limit iV — > 00 has been 
given by van Nimwegen et al. [64]. 
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Fig. 9. A single run of a population of N = 1000 sequences of length I = 15 in the 
oscillating Swetina-Schuster landscape. Parameters were Ao(t) = e 2A exp(2 sincji), 
Ai = 1 for i > 0, R = 0.06, T = 100, At = 1. The dashed line indicates the 
theoretical result for an infinite population. 

5.1.1 Loss of the master sequence 

Our first example of a finite population in a dynamic fitness landscape demon- 
strates what happens if in the temporarily ordered phase the master sequence 
is lost due to sampling fluctuations. Fig. 9 depicts a run of a finite population 
consisting of N = 1000 sequences of length / = 15, initialized randomly at 
t — 0, in an oscillating Swetina-Schuster landscape. For comparison, we have 
plotted the theoretical result for an infinite population. The infinite popula- 
tion is always in an ordered state and the order parameter m s never takes on 
values smaller than 0.2. Nevertheless, the finite population is likely to loose 
the master sequence whenever the order parameter of the infinite population 
reaches its minimum, since the error threshold is shifted towards lower error 
rates for finite populations. In our example run, the master sequence was lost 
at the end of the first oscillation period, but it was rediscovered shortly after- 
wards, so that the population could follow the infinite population dynamics 
for most of the second oscillation period as well. Right after a loss of the mas- 
ter sequence, the probability to rediscover the master has its highest value, 
because the population is still centered around the master sequence. Once the 
population has had the time to drift away from the position of the master 
sequence, the probability of a rediscovery drops rapidly. This is what hap- 
pened at the end of the second oscillation period. The population completely 
lost track of the master sequence, and it took the population more than 4 
oscillation periods to rediscover it. This is the main difference between a finite 
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Fig. 10. A single run of a population of N = 1000 sequences in a landscape as given 
in Eq. (93). All parameters were identical to the setup of Fig. 9. The dashed line 
again indicates the theoretical result for an infinite population. 

and an infinite population in the temporarily ordered phase. For an infinite 
population, the interval of disorder has the same well-defined length in each 
oscillation period, whereas for a finite population, once the population has 
entered the disordered state, it may take a long time until an ordered state is 
reached again. In fact, for the case of a single peak in a very large sequence 
space and a small population, the peak may effectively be lost forever once it 
has disappeared from the population. This can be seen as a dynamic version of 
Muller's ratchet [65]. A trait whose advantageous influence on the overall fit- 
ness of an individual is reduced at some point (it is not necessary that the trait 
becomes completely neutral or even deleterious) may be lost from the popula- 
tion due to sampling fluctuations. If at a later stage this trait again becomes 
very advantageous, it is unavailable to the population until it is rediscovered 
independently. However, in most cases a rediscovery is very unlikely. 



5.1.2 Persistency 

Another aspect of a finite population in a dynamic landscape is persistency, 
the tendency of a finite population not to be able to follow the changes in the 
landscape, even though the infinite population limit predicts this. An example 
of that effect is shown in Fig. 10. In that example, we have two alternating 
peaks at opposite corners of the boolean hypercube, as given by Eq. (93). 
Note that the peaks' minimal height is relatively small, but still larger than 
the rest of the landscape's height. In fact, all parameters are identical to the 
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situation shown in Fig. 9, so that this figure can be seen as an example of the 
dynamics around one of the peaks in Fig. 10. The infinite population result in 
Fig. 10 predicts that the population should move to the other peak whenever 
it becomes the higher one. However, the finite population does not follow 
this scheme. It stays localized around one of the two peaks for a long time, 
because a finite population does not occupy all possible points in the sequence 
space at the same time. Therefore, if a peak grows at a mutational distance 
far from the currently occupied peak, no sequence in the population is there 
to exploit the advantage, and hence the new opportunity goes undetected. 
Only if the population looses track of the first peak, which is possible because 
of the temporarily ordered phase, it can discover the second peak during its 
random drift. In the run of Fig. 10, this has happened twice. The first time, 
the population discovered the alternative peak at the end of the drift, and the 
second time, it rediscovered this same peak. 

Let us compare the case of a finite population in a dynamic landscape with 
several growing and shrinking peaks to a rugged, but static landscape. In the 
latter case, once the population has reached a local optimum it remains there, 
unless a rare mutation opens the possibility to move to a new, higher peak. 
The same applies to the dynamic situation. But in addition, the fluctuations 
and oscillations of the fitness values destabilize the population on local optima, 
and allow it to continue its search for other local optima. If the landscape's 
dynamics is such that the population, by following the local optima, moves into 
regions of low average fitness (observed, e.g., in [26]), the landscape might be 
called "deceptive" , and in the opposite case, it might be called "well-behaved" . 

5.2 A finite population on a simple periodic fitness landscape 

5.2.1 The error tail approximation 

In the above examples, we saw that the time it takes until the master sequence 
is rediscovered after it has been lost in the temporarily ordered phase may be 
much larger than the period length of the landscape. Hence, for several periods, 
the population does not follow the infinite population results, but remains in 
a disordered state. It would be desirable to have an analytic description of this 
behavior, and, in particular, to have an estimate of the probability with which 
a complete period is skipped, i.e., with which the master sequence is missed 
for a whole oscillation period. Unfortunately, the continuous time dependency 
of the master sequence's replication rate used in Sec. 5.1, 

A (t) = A ,statexp(esinu;t) , (113) 

renders the corresponding calculations very complicated. Instead, we study in 
this section a simplified fitness landscape that displays a temporarily ordered 
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phase similar to Fig. 9, but that is much easier to handle analytically. For a 
fitness landscape such as Eq. (113), we can — for sufficiently high error rate 
R — divide the oscillation period into two intervals. During the first interval 
Ii, of length Ti, the population is in an ordered state provided that the master 
sequence is present in the population, and during the second interval I2, of 
length T2, the population is in a disordered state, even if the master sequence 
is present. The beginning of the first interval need not coincide with the be- 
ginning of the oscillation period, but after a suitable shift of the time origin, 
this is always the case. Note that for a finite population, the second interval 
is larger than predicted by the infinite population limit, and it may exist even 
if the infinite population limit predicts a length T2 = 0, because the error 
threshold is shifted towards smaller error rates for finite populations [14,66], 
as discussed previously. 

Our approximation here is to keep the fitness landscape constant during the 
intervals ii and I<i- During the interval I±, we let the master sequence replicate 
with rate A$ 1, while all other sequences replicate with A = 1. During the 
second interval on the contrary, the fitness landscape becomes flat, i.e., all 
sequences replicate with A — 1. We continue to study the discrete process and 
set At = 1, so that Ti and T 2 give the number of time steps spent in each 
interval. In summary, the replication rate A (t) satisfies 



a: (f) < Ti 

(114) 

1 : else . 



In order to obtain expressions that can be treated easily even for a finite 
population, we use the error tail approximation introduced in [14]. In that 
approximation, the state of the system is fully described by the concentration 
of the master sequence. All other sequences are assumed to be uniformly spread 
over the remaining genotype space. This approximation underestimates the 
mutational backflow into the master sequence, and hence it underestimates 
the concentration of the master sequence itself, but this small deviation can 
be accepted in the light of the enormous simplifications in the calculations. 

Before studying the finite population dynamics, let us turn quickly to the 
infinite population limit. We express the state of the system at time t by 
a vector x(t) = (xo(t), Xi(t)) t , where xo(t) is the concentration of the master 
sequence, while Xi(t) = l — x (t) stands for the total concentration of all other 
sequences. The generation operator G(x, t) maps the population at time t into 
the population at time t + 1, i.e., 

x(t + l) = G(x(t),t). (115) 



45 



Here, G(x, t) is given by 



G(x,t) = 



[QA(t) + l]x 

A (t)x + xi + 1 ' 



(116) 



Q is the 2x2 matrix 

Q 



2<-l 



(1 - 



(117) 



/ 



and A(t) = diag(A (t), 1). The linear operator G(t) = QA(t) + 1 describes 
the evolution of the variables y(t), 



y(t + l) = G(t)y(t), 
which map into the original variables via 



x(t) 



e"-y{t) } 



(1,1). 



(118) 



(119) 



Hence, the eigensystem of G fully describes the time evolution of x(t). For 
the eigenvalues of G, we find 



Goo + G u ± V(G o - G n ) 2 + 4G iG 



10 



(120) 



where the plus sign corresponds to the index 0, and the minus sign corresponds 
to the index 1. The eigenvectors are 



#0,1 = J 



± 



:u±)\ 



(121) 



with 



Goo — G 



ii 



2G, 



1 1 



01 



G 



01 



4(^00 



Gu) 2 + GqiGio 



(122) 



Of course, the eigenvalues and the eigenvectors are different for the two in- 
tervals l\ and I2. For the first interval, inserting the explicit expressions for 
Gij into Eqs. (120)-(122) does not lead to a substantial simplification of the 
expressions. For the second interval, however, we find for the eigenvalues 



Ai 2) = 2, 



(2) 



2 - 



l-(l-R) 1 
1-2-' 



and for the eigenvectors 



(2-',l-2-')\ 



(123a) 
(123b) 

(124a) 
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4 2) = (!,-!)*■ (124b) 

The superscript (2) indicates that these results are only valid for the interval 
I 2 . From the above expressions, we obtain a simple formula for the evolution 
of the master sequence's concentration during the interval I2 in the following 
manner. Let the interval start at time t, and let the concentration of the master 
sequence at that moment in time be xo(t). Then we find n time steps later 

l0(t + n) -a„(e..^») + ai (AfVAf)"(e*>fV ( ' 

where a and a.\ have to be chosen such that 

x (t) = ao^ + «i0S 2) • (126) 

After solving Eq. (126) for a and a x and inserting everything back into 
Eq. (125), we find 

x (t + n) = 2~ l + [x (t) - 2~'] (l - ■ ( 12? ) 

This formula is sufficiently close to the solution obtained from diagonalization 
of the full 2 l x 2 l matrix Q in a flat landscape, and can be considered a good 
approximation to the actual infinite population dynamics [67]. In principle, a 
similar formula can be derived for the interval I±, but again, the expressions 
become very complicated and do not lead to any new insight. 

Equation (127) demonstrates that a macroscopic proportion of the master 
sequence that may have built up during the interval I± quickly decays to the 
expected concentration in a flat landscape, 2~ l . 

Let us now study finite population corrections. We assume the duration of the 
interval I\ is long enough so that the quasispecies can form. The asymptotic 
concentration of the master sequence can then be calculated from a birth and 
death process as done in [14]. The alternative diffusion approximation used 
in [66] is of no use here because it allows only replication rates Aq of the form 
A = 1 + e with a small e [68]. In [14], the probabilities pk to find the master 
sequence k times in the asymptotic distribution are given by 

Pk = ^ k ~ with Pk = ^Vk-i and po = 1. (128) 

Ei= Pi Mfc 



The probabilities [if and [i i read here 



. N-i 



^00 _ 1 
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and 



= l (effl + [eg- - 1] ^) . (iso) 

The expected asymptotic concentration becomes 

1 w 

x (oo) = — 5ZA;p fc . (131) 

k=0 

Unfortunately, there exists no analytic expression for xq(oo). However, its 
value is easily computed numerically. With the above assumption about the 
length of the interval I±, we can suppose that at the end of I\ the concentration 
of Xo is given by rr (oo). During the interval I2, the concentration of the master 
sequence will then decay. 



5.2.2 The probability to skip one period 



If at the end of the interval I2 the master sequence has been lost because of 
sampling fluctuations, and if in addition to that the correlations in the popu- 
lation have decayed so far that we can assume maximum entropy, what is the 
probability that the master sequence is rediscovered in the following interval 
hi The process of rediscovering the master consists of two steps. The master 
sequence has to be generated through mutation, and then it has to be fixated 
in the population, i.e., it must not be lost again due to sampling fluctuations. 
First, we calculate the probability P m i ss that the master is not generated in one 
time step. This corresponds to the probability that the multinomial sampling 
of the operator G^\x) maps a population x = (0, 1)* into itself. Hence, we 
have 



1 G W (x) ni 

p . = iV' TT { ' 

1 miss 11 
i=0 



Q 
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nil 

N 



1 - 



1 - (1 - R) 1 
2 l+l - 2 



N 



(132) 



G\ 1 \x) stands for the ith component of the outcome of G^(x). 



The probability that the master sequence becomes fixated requires more work. 
Denote by ir(x,t) the probability that the master sequence has reached its 
asymptotic concentration at time t, given that it had the initial concentration 
x at time t — 0. The asymptotic concentration is given by Xq(oo) defined in 
Eq. (131). Then, the probability ir(x,t) satisfies to second order the backward 
Fokker- Planck equation 



dir(x, t) 
dt 



dir{x,t) {(dxo) 2 ) d 2 ix(x,t) 
{dx ) — h 



dx 



dx 2 



(133) 
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The moments (dx ) and ((dxo) 2 ) can be calculated along the lines of [64], and 
we find 

(dx )= QaE, 1} - l) x =: 7x0 , (134) 
((^o) 2 ) = ^^. (135) 
The solution to Eq. (133) for t — > 00 is then obtained as in [64], and we find 

/ 1 \ 2Ar 7 +l 

:= 71 & °°) = Hi-ir 1 (136) 

^1-e" 27 . (137) 

As the initial concentration of rr , we have used 1/N, since it is — for the 
parameter settings we are interested in — extremely unlikely that more than 
one master sequence is generated in one time step. The approximation in the 
second line is only valid for large population sizes. It generally underestimates 
the true value of n^. 

Note that the expression for iToo given in Eq. (136) reaches the value 1 for the 
(relatively large) error rate R close to the error threshold for which xo(oo) = 
1/N. Naively, one would assume that decays with increasing error rate, 
since mutations increase the risk that good traits are lost, and indeed the ap- 
proximate expression in Eq. (137) decays with increasing error rate. However, 
since is the probability that the master sequence reaches its equilibrium 
concentration, and the equilibrium concentration vanishes close to the error 
threshold, Tioo must rise to 1 at the error threshold. 

We have performed simulations with a finite population to test the validity of 
Eq. (136). For a number of runs, we have initialized the population at random, 
but with exactly one instance of the master sequence, and have counted how 
often the master's concentration reached xq{oo) and how often it reached 0. 
The result of these runs are shown in Fig. 11. Clearly, numerical and analytical 
results are in good agreement. 

Finally, we need an estimate of the time r it takes from the time the master 
sequence is discovered to the time in which the equilibrium concentration is 
reached for the first time. We again follow the calculations in [64] , and assume 
that the process of fixation can be treated in the infinite population limit. 
From Eq. (116), we obtain for the change in the variable Xo(t) during one 
time step in the interval I\ 

~(a - l)x (t) 2 + (Q 00 a - ggi - l)x (t) + Q 01 
x (t + 1) - x (t) - (a - l)xo(t) + 2 • (138) 
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This can be approximate with a differential equation, 

dx (t) 



dt 



x (t + 1) - x (t) , 



(139) 



which we can solve for t as a function of x to obtain 



6 + 4 



( k ,b-2sx A ,b-2s/N\ 
Atanh - Atanh - — 



z ) 
1 . -sx\ + bx + Qqi 
~2 11 -s/W + b/N + Qqi 



with 



s = a — 1 , 



and 



z = yj4sQ 01 + b 2 . 



(140) 



(141) 
(142) 



(143) 



Therefore, for the estimated time it takes until the master sequence becomes 
fixated we will use in the following 



= t(rr (oo)) , 



(144) 



with x (oo) given in Eq. (131). 



We can now calculate the probability that the population skips a whole period, 
i.e., that it does not find and fixate the master during one interval I±. The 
probability that the master sequence has concentration zero at the beginning 
of the interval I\ is (1 — l/2 l ) N . Therefore, the probability that the master 
sequence is not fixated in the first time step is 



1 - 



1-1- 



1 

2 1 



7T 



(145) 



The probability that the master sequence is not found and subsequently fixated 
is given by 

1 - (1 - PmissKo ■ (146) 

Now, if the master sequence is found, it will roughly take the time r given 
in Eq. (144) until the equilibrium concentration is reached. Therefore, if the 
master sequence is not found during the first T\—r time steps, it normally will 
not reach the equilibrium concentration in that period. Therefore, in order to 
calculate the probability P s ki P (^i) that the whole interval 1\ is skipped, we 
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Fig. 11. The fixation probability function of the error rate R for three 

different heights of the peak. The solid lines stem from the analytic expression 
Eq. (136), and the dotted lines stem from measurements on a finite population 
consisting of N = 500 sequences. 

have to consider only the first T\ — r time steps of I\. In case that T\ < r, we 
have P s kip(Ti) ss 1. The equality is only approximate, because r is the average 
time until fixation occurs. In rare cases, the fixation may happen much faster. 

Of the Xi — r time steps, the first one is different because during that time step 
we do not know whether the master sequence is present or not, whereas for 
the remaining T\ — t — 1 time steps we may assume that the master sequence 
is not present if fixation has not occurred. Therefore, we find 



Figure 12 shows a comparison between this result and numerical simulations. 
The simulations were carried out by letting a randomly initialized population 
evolve in a flat landscape for 100 generations, and then recording the time it 
took the population to find and fixate a peak that was switched on in gener- 
ation 101. We observe that the analytic expression for P s ki p (7i) predicts the 
right order of magnitude and the right functional dependency on 7\, but that 
it generally underestimates the exact value. Since Eq. (147) contains three 
quantities for which we have only approximative expressions, namely P m i SS , 




[1 - (1 - PnissKoo] 



Tl-T-1 



(147) 




exp [-(7\ - r) (1 - P miss ) vrj . (148) 
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Fig. 12. The probability -P s kip(Ti) that the population skips a whole period without 
fixating the master sequence, as a function of the length of the interval I\, for several 
different settings of N and R. The string length is I = 15. 

7TOO, and r, at first it is not clear from where these discrepancies arise. How- 
ever, a systematic check quickly reveals the main cause of the discrepancies. 
First of all, note that r merely shifts the curve to the right. Since the measured 
and the analytic curves reach the value 1 at very much the same positions in 
Fig. 12, we can assume that r, as given by Eq. (144), is accurate enough for 



52 



our purposes here. Now consider the quantity rc^. In Fig. 11, we saw that our 
expression for generally gives a good estimate of the true value, but that 
there are some deviations. To check whether these deviations are responsible 
for the discrepancies visible in Fig. 12, we show additionally P s ki P (Pi) with ir^ 
determined numerically. We find that using a numerical enhances the pre- 
diction of Eq. (147), in particular for larger error rates. For small error rates, 
however, the discrepancy is still sizable. Moreover, the analytic expression is 
generally getting worse for smaller error rates. We can thus conclude that the 
main problems arise from the expression for P m i SS , Eq. (132). Indeed, we have 
derived P m i SS under a maximum entropy assumption, i.e., we assumed that all 
mutants are distributed homogeneously over sequence space. Under this as- 
sumption, the probability to find the master is exactly the same at every time 
step. But in reality, the population collapses very rapidly, even in a neutral 
landscape, and then moves about as a cluster whose radius is determined by 
the error rate. This introduces very long-range temporal correlations in a pop- 
ulation evolving in a flat landscape [69]. In particular, for small error rates the 
cluster is very small, and this can increase the probability P m i SS substantially. 
Note that this effect corresponds to the underestimation of epoch durations 
that van Nimwegen et al. found in their analysis of the Royal Road genetic 
algorithm [64]. An exact treatment of this effect would probably have to occur 
along the lines of [69]. Unfortunately, we cannot simply use their expressions 
here, because of the term +1 present in our definition of the operator G(x,t) 
[Eq. (116)]. 

In order to check the hypothesis that the violation of the maximum entropy 
condition causes the main discrepancies shown in Fig. 12, we performed ad- 
ditional simulations in which we dispersed the population "by hand" over the 
complete sequence space in every time step in which the master sequence was 
not discovered. With this setup, we found a very good agreement between the 
numerical and the analytical results. 

What can we conclude from Eq. (147)? First, note that the true P s ki P (Pi) must 
always be larger than the value predicted by Eq. (147), because the deviations 
from that value are caused by the population's collapse into a small cluster. 
Hence, Eq. (147) is a lower bound on P s ki P (Pi), and rediscovery of the peak is 
less likely than what Eq. (147) predicts. According to our prediction, P s ki P (Pi) 
decays exponentially. This means that the probability to find the peak in one 
oscillation period, 

P fi nd(T 1 ) = l-P skip (T 1 ), (149) 

approaches 1 for large T\. This is due to the fact that the peak will certainly be 
rediscovered if only we wait long enough. However, the model we are studying 
here is that of a peak that is switched on and off alternatingly, and for which 
each "on" -period is of fixed length T\. In that case, the probability to rediscover 
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the peak within one oscillation period can be extremely small, as we are going 
to see now. P s ki P (7i) decays with a rate of (1 — -PmissVoo- We can neglect ir^ 
here, as it is of the order of one. Then, the decay rate is for fixed N and large 
/ is approximately given by 



i.e., it decays as 2 . This implies in turn that already for string lengths of 50- 
60 (which can be considered a rough lower bound for typical DNA sequence 
lengths) and moderate N and R, we find Pfind(^i) ~ for moderate T\. Hence, 
in many cases it is extremely unlikely that the peak is rediscovered at all. 

The above conclusion is of course tightly connected to the fact that we have 
studied a landscape with a single advantageous sequence. In the other extreme 
of a wide-peak (Mount Fujiama) landscape, in which the population can sense 
the peak from every position in the sequence space, the conclusions would be 
different. Note, however, that neither the single-sharp-peak landscape nor the 
wide-peak landscape are realistic landscapes. In a realistic, high-dimensional 
rugged landscape, it is probably valid to assume that local optima, once they 
are lost from the population, are never rediscovered. In such situations, dy- 
namic fitness landscapes can induce the loss of a local optimum, and thus, 
they can accelerate Muller's ratchet [65] like effects. 



6 Conclusions 

In this report, we have derived several very general results about landscapes 
with periodic time dependency. First of all, a quasispecies can be defined by 
means of the monodromy matrix. This means that after a sufficiently long 
time, the state of the system depends only on the phase = it mod T)/T of 
the oscillation, but not on the absolute time t. Therefore, in periodic fitness 
landscapes, the quasispecies is not a fixed mixture of sequence concentrations. 
Instead, it is a T-periodic function of mixtures of sequence concentrations. We 
have given an expansion of the monodromy matrix in terms of the oscillation 
period T, which leads to an extremely simple description of the system for 
very high oscillation frequencies. Namely — if we assume the mutation matrix 
remains constant at all times — the time-averaged fitness landscape completely 
determines the behavior of the system, which is essentially indistinguishable 
from a system in a static landscape. This leads to the important conclusion 
that selection never ceases to act, no matter how fast the landscape changes. 
The only exceptions to this rule are due to dynamic landscapes that have a 
completely flat average. In that case, the system for very fast changes behaves 
as being subjected to a flat landscape, which is indistinguishable from the 
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behavior of a system above the error threshold. Therefore, if the average land- 
scape is flat, selection will break down if the changes occur with a frequency 
higher than some critical frequency 00* = 2n/T*. For very slow changes, on the 
other hand, the system is virtually in equilibrium all the time. This leads, in 
general, to a time dependent error threshold R*(t). For mutation rates R such 
that min t R*(t) < R < max t R*(t), the system is below the error threshold for 
some times t, and it is beyond the error threshold for other times. We have 
dubbed this region of the parameter space the temporarily ordered phase, as 
we see alternating patterns of order and disorder in that phase (in the infinite 
population limit). We found these general considerations to be in complete 
agreement with all example landscapes that we studied. 

Periodic fitness landscapes can be fully understood from the knowledge of the 
monodromy matrix. Therefore, in future work it should be tried to obtain an 
improved understanding of the properties of that matrix. In particular, an 
expansion of that matrix in the error rate R would help to further develop the 
schematic phase diagrams introduced in Sec. 3.1.3. 

While the molecular concentrations become T-periodic for t — + 00 in the infi- 
nite population limit, this is not necessarily the case when we consider finite 
populations. In the temporarily ordered phase, after a population has made 
the transition to the disordered state, it may not transition back to order as 
the infinite population would. Rather, once the population has lost the ordered 
state, it is often difficult for the population to return to it. From a very simple 
analytical model, we found that the probability that the ordered state is not 
rediscovered in one oscillation period decays exponentially with the length of 
the interval in which order is possible at all. The decay constant, however, is 
extremely small for large I, and therefore the rediscovery can become very un- 
likely. In more complex landscapes, this can lead to an acceleration of Muller's 
ratchet. 

For the case of non-periodic landscapes, we have argued that the main con- 
clusions remain valid, even if our mathematical formalism is not generally ap- 
plicable in that case. Fast changes in the landscape will average out, whereas 
slow changes lead to a quasistatic adaption of the quasispecies to the current 
landscape. 

Throughout this report, we have assumed that mutations arise in the copy 
process. An equally valid assumption is that of mutations arising on a per- 
unit-time basis (cosmic ray mutations), as opposed to the per generation basis 
implied by copy mutations. With the latter assumption, one has to study the 
parallel mutation and selection equations [19] instead of Eigen's equations. 
Since these equations can be linearized in the same way as the quasispecies 
equations, the formalism we developed applies to these equations also. The 
only difference between the two types of equations is that in the case of cosmic 
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ray mutations, the mutation matrix Q and the replication matrix A are added, 
whereas in the quasispecies case they are multiplied. 



A question that must remain unanswered within the current body of knowl- 
edge is to what extent dynamic fitness landscapes help with the progress of 
evolution. In Sec. 5, we have seen that the dynamics of a fitness landscape 
can destabilize a population on a local peak. On the one hand, being trapped 
in a local optimum is regarded as one of the main hindrances to the progress 
of evolution, so that the destabilizing effect seems to advance evolution. On 
the other hand, the same effect can lead to the loss of an advantageous trait. 
Whether the positive or the negative aspect prevails depends most certainly 
on details of the landscape. In a study of adaptive walks on dynamic NK land- 
scapes, exactly this question was addressed [40]. In that particular case, it was 
found that for a rapidly changing landscape, the loss of traits was dominant, 
whereas a slowly changing landscape could lead to a more efficient exploration 
of the high-fitness regions of genotype space. Apart from this particular study, 
however, the amount to which a dynamic landscape can advance the progress 
of evolution is unknown, and deserves more attention in future work. 
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A High-frequency expansion of X(£) for a single oscillating fitness 



Eq. (52), gives an expansion of the monodromy matrix for periodic landscapes, 
X(£q), in terms of the period length T. Here, we calculate the expansion ex- 
plicitly up to second order for an example landscape. We choose a landscape 
with a single oscillating peak. The replication rates are 



In this particular example, we set the decay rates to zero. With vanishing 
decay rates, the matrix W(t) reduces to QA(£), and as a consequence, we can 
write the nth average W fc (t) as 



peak 




(A.la) 
(A.lb) 



Mt) = 1 



for alH > . 



(W fc (f)) . . = J2 u i J2 v i ' ' ' 2 u k-iQimQmn ■ ■ ■ Q» k -d A > 



"l,"2,-^-lJ 



(t) (A.2) 
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with the generalized replication coefficients 



— 1 f T 

A„ u ...„ k _ uj (t) = 7^ J q Avi(t + ri) • • • 

• • • A^to + r fc _i) H' 1 Aj(t + T k )dn ■■■dr k . (A.3) 
Jo Jo 

For the landscape given in Eq. (A.l), the first order tensor of the generalized 
replication coefficients has two independent elements, which are (assuming 
i > 0) 

A (t) = a, (A.4a) 
A(t) = l. (A.4b) 

The second order tensor has four independent entries. After some algebra, we 
obtain (assuming again i > 0) 

— a 2 

Aoo(t) = — , (A.5a) 

Aoi(t) = \-7^ cos(cut) , (A.5b) 

A i0 (t) = 7: + 7T cos(cut) , (A.5c) 

Mt) = X - . (A.5d) 

In principle, the generalized replication coefficients A vljV2j _ Uk _ u j(t) can be cal- 
culated to arbitrary order for the landscape given in Eq. (A.l). However, the 
third order tensor has already 8 independent entries, and with every higher 
order, the number of independent entries doubles. 
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